1 Introduction

Recovery of signals with low-dimensional structures, in particular sparsity
[4], block-sparsity [t], and low-rankness [3], has found numerous applications in
signal sampling, control, inverse imaging, remote sensing, radar, sensor arrays,
image processing, computer vision, and so on. Mathematically, the recovery
of signals with low-dimensional structures aims to reconstruct a signal with a
prescribed structure, usually from noisy linear measurements, as follows:

$$y = Ax + w, \tag{1}$$

where $x \in \mathbb{R}^{N}$ is the signal to be reconstructed, $y \in \mathbb{R}^{m}$ is the measurement
vector, $A \in \mathbb{R}^{m \times N}$ is the sensing/measurement matrix, and $w \in \mathbb{R}^{m}$ is the
noise. For example, a sparse signal is assumed to have only a few non-zero
coefficients when represented as a linear combination of atoms from an orthogonal
basis or from an overcomplete dictionary. Similarly, for a block-sparse signal,
the non-zero coefficients are assumed to cluster into blocks.

A theoretically justified way to exploit the low-dimensional structure in
recovering $x$ is to minimize a convex function that is known to enforce that
low-dimensional structure. Examples include using the $\ell_1$ norm to enforce sparsity,
the block-$\ell_1$ norm (or $\ell_2/\ell_1$ norm) to enforce block-sparsity, and the nuclear
norm to enforce low-rankness. The performance of these convex enforcements
is usually analyzed using variants of the restricted isometry property
(RIP) [2], [3], [7]. Upper bounds on the $\ell_2$ norm of the error vectors for various
recovery algorithms have been expressed in terms of the RIP. Unfortunately, it
is extremely difficult to verify that the RIP of a specific sensing matrix satisfies
the conditions for the bounds to be valid, and even more difficult to directly
compute the RIP itself. Actually, the only known sensing matrices with nice
RIPs are certain types of random matrices [oj.

In this paper, we investigate the recovery performance for block-sparse
signals. Block-sparsity arises naturally in applications such as sensor arrays [12],
radar [18], multi-band signals [14], and DNA microarrays [16]. A particular
area that motivates this work is the application of block-sparse signal recovery
in radar systems. The signals in radar applications are usually sparse because
there are only a few targets to be estimated among many possibilities. However,
a single target manifests itself simultaneously in the sensor domain, the
frequency domain, the temporal domain, and the reflection-path domain. As
a consequence, the underlying signal is block-sparse when the radar
system observes the targets from several of these domains [17], [18].

The aforementioned applications require a computable performance analysis
of block-sparsity recovery. While it is perfectly reasonable to use a random
sensing matrix for signal sampling, the sensing matrices in other applications
are far from random and actually depend on the underlying measurement devices
and the physical processes that generate the observations. Due to the
computational challenges associated with the RIP, it is necessary to seek
computationally more amenable goodness measures of the sensing matrices.
Computable performance measures would open doors for wide applications. They
would provide a means to pre-determine the performance of the sensing system
before its implementation and the taking of measurements. In addition, in
applications where we have the flexibility to choose the sensing matrix,
computable performance analysis would form a basis for optimal sensing matrix
design.

We preview our contributions. First of all, we define a family of goodness
measures of the sensing matrix, and use them to derive performance bounds
on the block-$\ell_\infty$ norm of the recovery error vector. Performance bounds using
other norms are expressed using the block-$\ell_\infty$ norm. Our preliminary numerical
results show that these bounds are tighter than the block RIP based
bounds. Second and most important, we develop a fixed point iteration framework
to design algorithms that efficiently compute lower bounds on the goodness
measures for arbitrary sensing matrices. Each fixed point iteration solves
a series of semidefinite programs. The fixed point iteration framework also
demonstrates the algorithms' convergence to the global optima from any initial
point. As a by-product, we obtain a fast algorithm to verify the sufficient
condition guaranteeing exact block-sparsity recovery via block-$\ell_1$ minimization.
Finally, we show that the goodness measures are non-degenerate for
subgaussian and isotropic random sensing matrices as long as the number
of measurements is relatively large, a result parallel to that of the block RIP
for random matrices.



This work extends verifiable and computable performance analysis from
sparse signal cases [5], [6], [9], [21] to block-sparse signal cases. There are several
technical challenges that are unique to the block-sparse case. In the sparse
setting, the optimization subproblems are solved exactly by linear programming
or second-order cone programming. However, in the block-sparse setting,
the associated subproblems are not readily solvable and we need to develop
semidefinite relaxations to compute an upper bound on the optimal values of
the subproblems. The bounding technique complicates the situation as it is not
clear how to obtain bounds on the goodness measures from these subproblem
bounds. We develop a systematic fixed point theory to address this problem.



The rest of the paper is organized as follows. In Section 2, we introduce
notations and present the measurement model, three convex relaxation
algorithms, and the sufficient and necessary condition for exact block-$\ell_1$
recovery. In Section 3, we derive performance bounds on the block-$\ell_\infty$ norms of
the recovery errors for several convex relaxation algorithms. Section 4 is
devoted to the probabilistic analysis of our block-$\ell_\infty$ performance measures. In
Section 5, we design algorithms to verify a sufficient condition for exact
block-$\ell_1$ recovery in the noise-free case, and to compute the goodness measures of
arbitrary sensing matrices through fixed point iteration, bisection search, and
semidefinite programming. We evaluate the algorithms' performance in Section 6.
Section 7 summarizes our conclusions.

2 Notations, Measurement Model, and Recovery Algorithms



In this section, we introduce notations and the measurement model, and
present three block-sparsity recovery algorithms.

For any vector $x \in \mathbb{R}^{np}$, we partition the vector into $p$ blocks, each of length
$n$. More precisely, we have $x = [x_1^T, \dots, x_p^T]^T$, with the $i$th block $x_i \in \mathbb{R}^{n}$.

The block-$\ell_q$ norms for $1 \le q \le \infty$ associated with this block-sparse structure
are defined as

$$\|x\|_{bq} = \left( \sum_{i=1}^{p} \|x_i\|_2^q \right)^{1/q}, \quad 1 \le q < \infty, \tag{2}$$

and

$$\|x\|_{b\infty} = \max_{1 \le i \le p} \|x_i\|_2, \quad q = \infty. \tag{3}$$

The canonical inner product in $\mathbb{R}^{np}$ is denoted by $\langle \cdot, \cdot \rangle$, and the $\ell_2$ (or
Euclidean) norm is $\|x\|_2 = \sqrt{\langle x, x \rangle}$. Obviously, the block-$\ell_2$ norm is the same as
the ordinary $\ell_2$ norm. We use $\|\cdot\|_\diamond$ to denote a general norm, and use $\otimes$ to
denote the Kronecker product.
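The definitions (2) and (3) can be illustrated with a minimal sketch. The function names `block_norms` and `block_lq_norm` are hypothetical and chosen here only for illustration; the signal is stored as a flat list whose consecutive groups of $n$ entries are the blocks.

```python
import math

def block_norms(x, n):
    """Return the Euclidean norm of each length-n block of the flat sequence x."""
    if len(x) % n != 0:
        raise ValueError("len(x) must be a multiple of the block length n")
    return [math.sqrt(sum(v * v for v in x[i:i + n])) for i in range(0, len(x), n)]

def block_lq_norm(x, n, q):
    """Block-l_q norm for 1 <= q < infinity; pass q = math.inf for the block-l_inf norm."""
    norms = block_norms(x, n)
    if q == math.inf:
        return max(norms)
    return sum(v ** q for v in norms) ** (1.0 / q)

# p = 3 blocks of length n = 2, with block norms 5, 0, 5
x = [3.0, 4.0, 0.0, 0.0, 0.0, 5.0]
b1 = block_lq_norm(x, 2, 1)            # 10.0
b2 = block_lq_norm(x, 2, 2)            # sqrt(50), the ordinary Euclidean norm
binf = block_lq_norm(x, 2, math.inf)   # 5.0
```

The value of `b2` agrees with the remark above that the block-$\ell_2$ norm coincides with the ordinary $\ell_2$ norm.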

The block support of $x \in \mathbb{R}^{np}$, $\mathrm{bsupp}(x) = \{i : \|x_i\|_2 \neq 0\}$, is the index
set of the non-zero blocks of $x$. The size of the block support, denoted by
the block-$\ell_0$ "norm" $\|x\|_{b0}$, is the block-sparsity level of $x$. Signals of
block-sparsity level at most $k$ are called $k$-block-sparse signals. If $S \subseteq \{1, \dots, p\}$
is an index set, then $|S|$ is the cardinality of $S$, and $x_S \in \mathbb{R}^{n|S|}$ is the vector
formed by the blocks of $x$ with indices in $S$.
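As a small illustration of the block support and of the restriction $x_S$, consider the following sketch. The helper names `bsupp` and `restrict` are hypothetical, and block indices are 1-based to match the text.

```python
import math

def bsupp(x, n, tol=0.0):
    """Indices (1-based, as in the text) of blocks whose Euclidean norm exceeds tol."""
    p = len(x) // n
    return {i + 1 for i in range(p)
            if math.sqrt(sum(v * v for v in x[i * n:(i + 1) * n])) > tol}

def restrict(x, n, S):
    """Stack the blocks of x with (1-based) indices in S, in increasing order."""
    out = []
    for i in sorted(S):
        out.extend(x[(i - 1) * n:i * n])
    return out

x = [3.0, 4.0, 0.0, 0.0, 0.0, 5.0]   # p = 3 blocks of length n = 2
S = bsupp(x, 2)                       # {1, 3}
k = len(S)                            # block-sparsity level 2
x_S = restrict(x, 2, S)               # [3.0, 4.0, 0.0, 5.0]
```

Note that the block-sparsity level counts non-zero blocks, so a block such as $(0, 5)$ contributes one to $\|x\|_{b0}$ even though it contains a zero entry.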

We use $e_i$, $\mathbf{0}$, $O$, $\mathbf{1}$, and $I_N$ to denote respectively the $i$th canonical basis
vector, the zero column vector, the zero matrix, the column vector with all
ones, and the identity matrix of size $N \times N$.

Suppose $x$ is a $k$-block-sparse signal. In this paper, we observe $x$ through
the following linear model:

$$y = Ax + w, \tag{4}$$

where $A \in \mathbb{R}^{m \times np}$ is the measurement/sensing matrix, $y$ is the measurement
vector, and $w$ is noise. A very special block-sparse model is when $x$ is a
complex signal, such as the models in sensor arrays and radar applications; in
that case each block collects the real and imaginary parts of one complex
coefficient. Note that the computable performance analysis developed in [20], [21]
for real variables cannot be applied to the complex case.

Many algorithms in sparse signal recovery have been extended to recover
the block-sparse signal $x$ from $y$ by exploiting the block-sparsity of $x$. We
focus on three algorithms based on block-$\ell_1$ minimization: the Block-Sparse
Basis Pursuit (BS-BP) [?], the Block-Sparse Dantzig selector (BS-DS) [u],
and the Block-Sparse LASSO estimator (BS-LASSO) [24].

$$\text{BS-BP:} \quad \min_{z \in \mathbb{R}^{np}} \|z\|_{b1} \quad \text{s.t.} \quad \|y - Az\|_2 \le \varepsilon \tag{5}$$

$$\text{BS-DS:} \quad \min_{z \in \mathbb{R}^{np}} \|z\|_{b1} \quad \text{s.t.} \quad \|A^T(y - Az)\|_{b\infty} \le \mu \tag{6}$$

$$\text{BS-LASSO:} \quad \min_{z \in \mathbb{R}^{np}} \tfrac{1}{2}\|y - Az\|_2^2 + \mu \|z\|_{b1}. \tag{7}$$

Here $\mu$ is a tuning parameter, and $\varepsilon$ is a measure of the noise level. All three
optimization problems have efficient implementations using convex programming.
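To make the BS-LASSO objective (7) concrete, the following is a minimal sketch of one standard way to minimize it, proximal gradient descent with block soft-thresholding. This is a swapped-in textbook solver, not the implementation used in this paper, and the names `block_soft_threshold` and `bs_lasso_ista`, the problem sizes, and the value of `mu` are assumptions made for illustration.

```python
import numpy as np

def block_soft_threshold(v, n, tau):
    """Shrink each length-n block of v toward zero by tau in Euclidean norm."""
    blocks = v.reshape(-1, n)
    norms = np.linalg.norm(blocks, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-15), 0.0)
    return (blocks * scale).ravel()

def bs_lasso_ista(A, y, n, mu, iters=2000):
    """Proximal gradient for min_z 0.5*||y - A z||_2^2 + mu*||z||_b1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    z = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ z - y)             # gradient of the quadratic term
        z = block_soft_threshold(z - step * grad, n, step * mu)
    return z

rng = np.random.default_rng(0)
n, p, m = 2, 10, 12                          # hypothetical sizes
A = rng.standard_normal((m, n * p)) / np.sqrt(m)
x = np.zeros(n * p)
x[0:2] = [1.0, -2.0]                         # a 1-block-sparse signal
y = A @ x                                    # noise-free measurements
x_hat = bs_lasso_ista(A, y, n, mu=1e-3)
```

Block soft-thresholding is the proximal operator of the block-$\ell_1$ norm, which is why whole blocks are either shrunk together or set to zero. With the step size $1/\|A\|_2^2$ the objective value does not increase from one iteration to the next.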

In the noise-free case where $w = 0$, roughly speaking, all three algorithms
reduce to

$$\min_{z \in \mathbb{R}^{np}} \|z\|_{b1} \quad \text{s.t.} \quad Az = Ax, \tag{8}$$

which is the block-$\ell_1$ relaxation of the block-$\ell_0$ problem:

$$\min_{z \in \mathbb{R}^{np}} \|z\|_{b0} \quad \text{s.t.} \quad Az = Ax. \tag{9}$$

A minimal requirement on the block-$\ell_1$ minimization algorithms is the uniqueness
and exactness of the solution $\hat{x} := \operatorname{argmin}_{z : Az = Ax} \|z\|_{b1}$, i.e., $\hat{x} = x$.
When the true signal $x$ is $k$-block-sparse, the sufficient and necessary condition
for exact block-$\ell_1$ recovery is [19]

$$\sum_{i \in S} \|z_i\|_2 < \sum_{i \notin S} \|z_i\|_2, \quad \forall z \in \mathrm{Ker}(A) \setminus \{0\},\ |S| \le k, \tag{10}$$

where $\mathrm{Ker}(A) := \{z : Az = 0\}$ is the kernel of $A$, $S \subseteq \{1, \dots, p\}$ is an index
set, and $z_i$ is the $i$th block of $z$ of size $n$.



3 Performance Bounds on the Block-$\ell_\infty$ Norms of the Recovery Errors

In this section, we derive performance bounds on the block-$\ell_\infty$ norms of the
error vectors. We first establish a proposition characterizing the error vectors
of the block-$\ell_1$ recovery algorithms, whose proof is given in Appendix 8.1.

Proposition 1 Suppose $x \in \mathbb{R}^{np}$ in (4) is $k$-block-sparse and the noise $w$
satisfies $\|w\|_2 \le \varepsilon$, $\|A^T w\|_{b\infty} \le \mu$, and $\|A^T w\|_{b\infty} \le \kappa\mu$, $\kappa \in (0, 1)$, for the
BS-BP, the BS-DS, and the BS-LASSO, respectively. Define $h = \hat{x} - x$ as the
error vector for any of the three block-$\ell_1$ recovery algorithms (5), (6), and (7).
Then we have

$$\|h_S\|_{b1} \ge \|h\|_{b1} / c, \tag{11}$$

where $S = \mathrm{bsupp}(x)$, $c = 2$ for the BS-BP and the BS-DS, and $c = 2/(1 - \kappa)$
for the BS-LASSO.

An immediate corollary of Proposition 1 is to bound the block-$\ell_1$ and $\ell_2$
norms of the error vectors using the block-$\ell_\infty$ norm. The proof is given in
Appendix 8.2.

Corollary 1 Under the assumptions of Proposition 1, we have

$$\|h\|_{b1} \le ck \|h\|_{b\infty}, \text{ and} \tag{12}$$

$$\|h\|_2 \le \sqrt{ck}\, \|h\|_{b\infty}. \tag{13}$$

Furthermore, if $S = \mathrm{bsupp}(x)$ and $\beta = \min_{i \in S} \|x_i\|_2$, then $\|h\|_{b\infty} < \beta/2$
implies

$$\{i : \|\hat{x}_i\|_2 > \beta/2\} = \mathrm{bsupp}(x), \tag{14}$$

i.e., a thresholding operator recovers the signal block-support.
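The thresholding operator in (14) can be sketched as follows. The function name `threshold_support` and the numerical estimate `x_hat` are hypothetical; the estimate was chosen so that the block-$\ell_\infty$ error is below $\beta/2$, as the corollary requires.

```python
import math

def threshold_support(x_hat, n, beta):
    """Blocks of x_hat whose Euclidean norm exceeds beta/2 (1-based indices)."""
    p = len(x_hat) // n
    return {i + 1 for i in range(p)
            if math.sqrt(sum(v * v for v in x_hat[i * n:(i + 1) * n])) > beta / 2}

def block_linf(v, n):
    """Block-l_inf norm of the flat sequence v."""
    return max(math.sqrt(sum(t * t for t in v[i:i + n])) for i in range(0, len(v), n))

x     = [3.0, 4.0, 0.0, 0.0, 0.0, 5.0]     # true signal: bsupp = {1, 3}, beta = 5
x_hat = [2.0, 3.5, 0.9, -0.8, 1.0, 4.0]    # hypothetical estimate
h = [a - b for a, b in zip(x_hat, x)]      # error vector

err = block_linf(h, 2)                      # about 1.41, below beta/2 = 2.5
recovered = threshold_support(x_hat, 2, beta=5.0)   # {1, 3}
```

The second block of `x_hat` is non-zero, so the plain block support of the estimate would be wrong; thresholding at $\beta/2$ removes it.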

For ease of presentation, we introduce the following notation:

Definition 1 For any $s \in [1, p]$ and matrix $A \in \mathbb{R}^{m \times np}$, define

$$\omega_\diamond(Q, s) = \min_{z : \|z\|_{b1} / \|z\|_{b\infty} \le s} \frac{\|Qz\|_\diamond}{\|z\|_{b\infty}}, \tag{15}$$

where $Q$ is either $A$ or $A^T A$.

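Now we present the error bounds on the block-$\ell_\infty$ norm of the error vectors
for the BS-BP, the BS-DS, and the BS-LASSO, whose proof is given in
Appendix 8.3.

Theorem 1 Under the assumption of Proposition 1, we have

$$\|\hat{x} - x\|_{b\infty} \le \frac{2\varepsilon}{\omega_2(A, 2k)} \tag{16}$$

for the BS-BP,

$$\|\hat{x} - x\|_{b\infty} \le \frac{2\mu}{\omega_{b\infty}(A^T A, 2k)} \tag{17}$$

for the BS-DS, and

$$\|\hat{x} - x\|_{b\infty} \le \frac{(1 + \kappa)\mu}{\omega_{b\infty}(A^T A, 2k/(1 - \kappa))} \tag{18}$$

for the BS-LASSO.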

A consequence of Theorem 1 and Corollary 1 is the error bound on the $\ell_2$
norm:

Corollary 2 Under the assumption of Proposition 1, the $\ell_2$ norms of the
recovery errors are bounded as

$$\|\hat{x} - x\|_2 \le \frac{2\sqrt{2k}\,\varepsilon}{\omega_2(A, 2k)} \tag{19}$$

for the BS-BP,

$$\|\hat{x} - x\|_2 \le \frac{2\sqrt{2k}\,\mu}{\omega_{b\infty}(A^T A, 2k)} \tag{20}$$

for the BS-DS, and

$$\|\hat{x} - x\|_2 \le \sqrt{\frac{2k}{1 - \kappa}}\, \frac{(1 + \kappa)\mu}{\omega_{b\infty}(A^T A, 2k/(1 - \kappa))} \tag{21}$$

for the BS-LASSO.

One of the primary contributions of this work is the design of algorithms
that compute $\omega_2(A, s)$ and $\omega_{b\infty}(A^T A, s)$ efficiently. The algorithms provide
a way to numerically assess the performance of the BS-BP, the BS-DS, and
the BS-LASSO according to the bounds given in Theorem 1 and Corollary 2.
According to Corollary 1, the correct recovery of the signal block-support is also
guaranteed by reducing the block-$\ell_\infty$ norm to some threshold. In Section 4,
we also demonstrate that the bounds in Theorem 1 are non-trivial for a large
class of random sensing matrices, as long as $m$ is relatively large. Numerical
simulations in Section 6 show that in many cases the error bounds on the $\ell_2$
norms based on Corollary 2 are tighter than the block RIP based bounds.
Before we turn to the computation issues, we first establish some results on
the probabilistic behavior of $\omega_\diamond(Q, s)$.



4 Probabilistic Behavior of $\omega_\diamond(Q, s)$

In this section, we analyze how good the performance bounds in Theorem 1
are for random sensing matrices. For this purpose, we define the block
$\ell_1$-constrained minimal singular value (block $\ell_1$-CMSV), which is an extension
of the $\ell_1$-CMSV concept in the sparse setting [20]:

Definition 2 For any $s \in [1, p]$ and matrix $A \in \mathbb{R}^{m \times np}$, define the block
$\ell_1$-constrained minimal singular value (abbreviated as block $\ell_1$-CMSV) of $A$
by

$$\rho_s(A) = \min_{z : \|z\|_{b1}^2 / \|z\|_2^2 \le s} \frac{\|Az\|_2}{\|z\|_2}. \tag{22}$$

The most important difference between $\rho_s(A)$ and $\omega_\diamond(Q, s)$ is the replacement
of $\|\cdot\|_{b\infty}$ with $\|\cdot\|_2$ in the denominators of the fractional constraint and
the objective function. The Euclidean norm $\|\cdot\|_2$ is more amenable to
probabilistic analysis. The connections between $\rho_s(A)$, $\omega_2(A, s)$, and $\omega_{b\infty}(A^T A, s)$,
established in the following lemma, allow us to analyze the probabilistic
behaviors of $\omega_\diamond(Q, s)$ using the results for $\rho_s(A)$ which we are going to establish
later.

Lemma 1

$$\sqrt{s}\,\sqrt{\omega_{b\infty}(A^T A, s)} \ge \omega_2(A, s) \ge \rho_{s^2}(A). \tag{23}$$

For a proof, see Appendix 8.4.

Next we derive a condition on the number of measurements needed to get
$\rho_s(A)$ bounded away from zero with high probability for sensing matrices with
i.i.d. subgaussian and isotropic rows. Note that a random vector $X \in \mathbb{R}^{np}$ is
called isotropic and subgaussian with constant $L$ if $\mathbb{E}|\langle X, u \rangle|^2 = \|u\|_2^2$ and
$\mathbb{P}(|\langle X, u \rangle| \ge t) \le 2\exp(-t^2/(L\|u\|_2))$ hold for any $u \in \mathbb{R}^{np}$.

Theorem 2 Let the rows of $\sqrt{m}A$ be i.i.d. subgaussian and isotropic random
vectors with numerical constant $L$. Then there exist constants $c_1$ and $c_2$ such
that for any $\epsilon > 0$ and $m \ge 1$ satisfying

$$m \ge c_1 \frac{L^2 (sn + s \log p)}{\epsilon^2}, \tag{24}$$

we have

$$\mathbb{E}\,\rho_s(A) \ge 1 - \epsilon, \tag{25}$$

and

$$\mathbb{P}\{\rho_s(A) \ge 1 - \epsilon\} \ge 1 - \exp(-c_2 \epsilon^2 m / L^4). \tag{26}$$

For a proof, see Appendix 8.5.

Using $\rho_s(A)$, we could equally develop bounds similar to those of Theorem
1 on the $\ell_2$ norm of the error vectors. For example, the error bound for the
BS-BP would look like

$$\|\hat{x} - x\|_2 \le \frac{2\varepsilon}{\rho_{2k}(A)}. \tag{27}$$

The conclusion of Theorem 2, combined with the previous equation, implies
that we could stably recover a block-sparse signal using BS-BP with high
probability if the sensing matrix is subgaussian and isotropic and
$m \ge c(kn + k \log p)/\epsilon^2$. If we do not consider the block structure in the signal, we would
need $m \ge c(kn \log(np))/\epsilon^2$ measurements, as the sparsity level is $kn$ [20].
Therefore, the prior information regarding the block structure greatly reduces
the number of measurements necessary to recover the signal. The lower bound
on $m$ is essentially the same as the one given by the block RIP (see [7, Proposition 4]).
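The saving in measurements can be seen with a short arithmetic sketch. The unknown constant $c$ and the factor $1/\epsilon^2$ are common to both bounds and are omitted, so only the ratio of the two quantities is meaningful; the values $k = 5$, $n = 4$, $p = 60$ are hypothetical, and the function names are introduced here for illustration.

```python
import math

def m_block(k, n, p):
    """Scaling k*n + k*log(p) of the block-sparse bound (constants omitted)."""
    return k * n + k * math.log(p)

def m_sparse(k, n, p):
    """Scaling k*n*log(n*p) when the block structure is ignored (constants omitted)."""
    return k * n * math.log(n * p)

k, n, p = 5, 4, 60                 # hypothetical problem sizes
block = m_block(k, n, p)           # 20 + 5*ln(60), about 40.47
sparse = m_sparse(k, n, p)         # 20*ln(240), about 109.61
ratio = sparse / block             # about 2.7
```

The logarithmic factor multiplies only the number of blocks $k$ in the block-sparse bound, rather than the total number of non-zero coefficients $kn$, which is where the reduction comes from.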
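Theorem 3 Under the assumptions and notations of Theorem 2, there exist
constants $c_1$ and $c_2$ such that for any $\epsilon > 0$ and $m \ge 1$ satisfying

$$m \ge c_1 \frac{L^2 (s^2 n + s^2 \log p)}{\epsilon^2}, \tag{28}$$

we have

$$\mathbb{E}\,\omega_2(A, s) \ge 1 - \epsilon, \text{ and} \tag{29}$$

$$\mathbb{P}\{\omega_2(A, s) \ge 1 - \epsilon\} \ge 1 - \exp(-c_2 \epsilon^2 m / L^4), \tag{30}$$

and

$$\mathbb{E}\,\omega_{b\infty}(A^T A, s) \ge \frac{(1 - \epsilon)^2}{s}, \text{ and} \tag{31}$$

$$\mathbb{P}\left\{\omega_{b\infty}(A^T A, s) \ge \frac{(1 - \epsilon)^2}{s}\right\} \ge 1 - \exp(-c_2 \epsilon^2 m / L^4). \tag{32}$$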

Equation (28) and Theorem 1 imply that for exact signal recovery in the
noise-free case, we need $O(s^2(n + \log p))$ measurements for random sensing
matrices. The extra $s$ suggests that the $\omega_\diamond$ based approach to verify exact
recovery is not as good as the one based on $\rho_s$. However, $\omega_\diamond$ is computationally
more amenable, as we are going to see in Section 5. The measurement bound
(28) also implies that the algorithms for verifying $\omega_\diamond > 0$ and for computing
$\omega_\diamond$ work for $s$ at least up to the order $\sqrt{m/(n + \log p)}$.
Finally, we comment that sensing matrices with i.i.d. subgaussian and
isotropic rows include the Gaussian ensemble, the Bernoulli ensemble, and
normalized volume measure on various convex symmetric bodies, for example,
the unit balls of $\ell_q^{np}$ for $2 \le q \le \infty$ [13].
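For concreteness, the Gaussian and Bernoulli ensembles can be generated as in the following sketch, scaled so that the rows of $\sqrt{m}A$ have identity second-moment matrix (the isotropy condition). The function names and sizes are assumptions made for illustration.

```python
import numpy as np

def gaussian_ensemble(m, N, rng):
    """m x N matrix with i.i.d. N(0, 1/m) entries."""
    return rng.standard_normal((m, N)) / np.sqrt(m)

def bernoulli_ensemble(m, N, rng):
    """m x N matrix with i.i.d. entries equal to +1/sqrt(m) or -1/sqrt(m)."""
    return rng.choice([-1.0, 1.0], size=(m, N)) / np.sqrt(m)

rng = np.random.default_rng(0)
m, N = 5000, 4                       # hypothetical sizes; N plays the role of np
G = gaussian_ensemble(m, N, rng)
B = bernoulli_ensemble(m, N, rng)

# The rows X of sqrt(m)*G are isotropic: the empirical average of X X^T,
# which equals G^T G, is close to the identity for large m.
second_moment = G.T @ G
```

The deviation of `second_moment` from the identity shrinks at the rate $1/\sqrt{m}$, which is an empirical check of isotropy only and says nothing about the subgaussian constant $L$.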



5 Verification and Computation of $\omega_\diamond$

In this section, we consider the computational issues of $\omega_\diamond(\cdot)$. We will present
a very general algorithm and make it specific only when necessary. For this
purpose, we use $Q$ to denote either $A$ or $A^T A$.



5.1 Verification of $\omega_\diamond > 0$

A prerequisite for the bounds in Theorem 1 to be valid is the positiveness of
the involved $\omega_\diamond(\cdot)$. We call the validation of $\omega_\diamond(\cdot) > 0$ the verification problem.
Note that from Theorem 1, $\omega_\diamond(\cdot) > 0$ implies the exact recovery of the true
signal $x$ in the noise-free case. Therefore, verifying $\omega_\diamond(\cdot) > 0$ is equivalent to
verifying a sufficient condition for exact block-$\ell_1$ recovery.

Verifying $\omega_\diamond(Q, s) > 0$ amounts to making sure $\|z\|_{b1}/\|z\|_{b\infty} > s$ for all
non-zero $z$ such that $Qz = 0$. Therefore, we compute

$$s^* = \min_z \frac{\|z\|_{b1}}{\|z\|_{b\infty}} \quad \text{s.t.} \quad Qz = 0. \tag{33}$$

Then, when $s < s^*$, we have $\omega_\diamond(Q, s) > 0$. The following proposition presents an
optimization procedure that computes a lower bound on $s^*$.

Proposition 2 The reciprocal, denoted by $s_*$, of the optimal value of the
following optimization,

$$\max_i \min_{P_i} \max_j \|\delta_{ij} I_n - P_i^T Q_j\|_2, \tag{34}$$

is a lower bound on $s^*$. Here $P$ is a matrix variable of the same size as $Q$,
$\delta_{ij} = 1$ for $i = j$ and $0$ otherwise, and $P = [P_1, \dots, P_p]$, $Q = [Q_1, \dots, Q_p]$ with
$P_i$ and $Q_j$ having $n$ columns each.

The proof is given in Appendix 8.6.

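Because $s_* \le s^*$, the condition $s < s_*$ is a sufficient condition for $\omega_\diamond > 0$
and for the uniqueness and exactness of block-sparse recovery in the noise-free
case. To get $s_*$, for each $i$, we need to solve

$$\min_{P_i} \max_j \|\delta_{ij} I_n - P_i^T Q_j\|_2. \tag{35}$$

A semidefinite program equivalent to (35) is given as follows:

$$\min_{P_i, t} t \quad \text{s.t.} \quad \|\delta_{ij} I_n - P_i^T Q_j\|_2 \le t,\ j = 1, \dots, p \tag{36}$$

$$\Leftrightarrow\ \min_{P_i, t} t \quad \text{s.t.} \quad \begin{bmatrix} t I_n & \delta_{ij} I_n - P_i^T Q_j \\ \delta_{ij} I_n - Q_j^T P_i & t I_n \end{bmatrix} \succeq 0,\ j = 1, \dots, p. \tag{37}$$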
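The step from (36) to (37) uses the standard fact that a spectral-norm bound $\|M\|_2 \le t$ is equivalent to positive semidefiniteness of the block matrix with $tI$ on the diagonal and $M$, $M^T$ off the diagonal, whose eigenvalues are $t \pm \sigma_k(M)$. A minimal numerical check of this equivalence is sketched below; the helper `lmi_holds` and the random test matrix are assumptions made for illustration.

```python
import numpy as np

def lmi_holds(M, t, tol=1e-10):
    """True if [[t I, M], [M^T, t I]] is positive semidefinite (up to tol)."""
    r, c = M.shape
    block = np.block([[t * np.eye(r), M], [M.T, t * np.eye(c)]])
    return np.linalg.eigvalsh(block).min() >= -tol

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3))        # stands in for delta_ij I_n - P_i^T Q_j
spec = np.linalg.norm(M, 2)            # spectral norm = largest singular value

above = lmi_holds(M, 1.01 * spec)      # True: t exceeds the spectral norm
below = lmi_holds(M, 0.99 * spec)      # False: t is smaller than the spectral norm
```

This is why each norm constraint in (36) can be handed to a semidefinite solver as one linear matrix inequality in $(P_i, t)$.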



Small instances of (36) and (37) can be solved using CVX [1]. However, for
medium to large scale problems, it is beneficial to use first-order techniques to
solve (35) directly. We observe that $\max_j \|\delta_{ij} I_n - P_i^T Q_j\|_2$ can be expressed
as the largest eigenvalue of a block-diagonal matrix. The smoothing technique
for semidefinite optimization developed in [15] can be used to minimize the
largest eigenvalue with respect to $P_i$. We leave these implementations to future
work.
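Without any solver, a weaker but still valid bound is available: evaluating the objective of (35) at any particular $P_i$ upper-bounds its minimum, so the reciprocal of the resulting value of (34) is a lower bound on $s_*$ and hence on $s^*$. The sketch below uses the choice $P_i = Q_i (Q_i^T Q_i)^{-1}$, which is an assumption made here for illustration and is not claimed to be optimal; it requires every block $Q_i$ to have full column rank.

```python
import numpy as np

def feasible_s_lower_bound(Q, n):
    """Lower bound on s_* from the feasible choice P_i = Q_i (Q_i^T Q_i)^{-1}."""
    p = Q.shape[1] // n
    blocks = [Q[:, j * n:(j + 1) * n] for j in range(p)]
    worst = 0.0
    for i in range(p):
        # With this P_i, P_i^T Q_i = I_n, so the j = i term vanishes.
        P_i = blocks[i] @ np.linalg.inv(blocks[i].T @ blocks[i])
        for j in range(p):
            target = np.eye(n) if i == j else np.zeros((n, n))
            worst = max(worst, np.linalg.norm(target - P_i.T @ blocks[j], 2))
    return 1.0 / worst

rng = np.random.default_rng(5)
A = rng.standard_normal((8, 12))
A /= np.linalg.norm(A, axis=0)              # unit-norm columns
bound_n2 = feasible_s_lower_bound(A, 2)     # p = 6 blocks of length 2
bound_n1 = feasible_s_lower_bound(A, 1)     # reduces to 1 / mutual coherence
```

For $n = 1$ and unit-norm columns this choice gives the reciprocal of the mutual coherence of the columns. Solving (35) exactly can only improve on the value returned here.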

Due to the equivalence of $A^T A z = 0$ and $Az = 0$, we always solve (33) for
$Q = A$ and avoid $Q = A^T A$. The former involves solving semidefinite
programs of smaller size. In practice, we usually replace $A$ with the matrix
with orthogonal rows obtained from the economy-size QR decomposition of $A^T$.


5.2 Fixed Point Theory for Computing $\omega_\diamond(\cdot)$

We present a general fixed point procedure to compute $\omega_\diamond$. Recall that the
optimization problem defining $\omega_\diamond$ is as follows:

$$\omega_\diamond(Q, s) = \min_z \frac{\|Qz\|_\diamond}{\|z\|_{b\infty}} \quad \text{s.t.} \quad \frac{\|z\|_{b1}}{\|z\|_{b\infty}} \le s, \tag{38}$$

or equivalently,

$$\frac{1}{\omega_\diamond(Q, s)} = \max_z \|z\|_{b\infty} \quad \text{s.t.} \quad \|Qz\|_\diamond \le 1,\ \frac{\|z\|_{b1}}{\|z\|_{b\infty}} \le s. \tag{39}$$

For any $s \in (1, s^*)$, we define a function over $[0, \infty)$ parameterized by $s$

$$f_s(\eta) = \max_z \{\|z\|_{b\infty} : \|Qz\|_\diamond \le 1,\ \|z\|_{b1} \le s\eta\}. \tag{40}$$

We basically replaced the $\|z\|_{b\infty}$ in the denominator of the fractional
constraint in (39) with $\eta$. It turns out that the unique positive fixed point of $f_s(\eta)$
is exactly $1/\omega_\diamond(Q, s)$, as shown by the following proposition. See Appendix 8.7
for the proof.

Proposition 3 The function $f_s(\eta)$ has the following properties:

1. $f_s(\eta)$ is continuous in $\eta$.

2. $f_s(\eta)$ is strictly increasing in $\eta$.

3. $f_s(0) = 0$, $f_s(\eta) \ge s\eta > \eta$ for sufficiently small $\eta > 0$, and there exists
$\rho < 1$ such that $f_s(\eta) < \rho\eta$ for sufficiently large $\eta$.

4. $f_s(\eta)$ has a unique positive fixed point $\eta^* = f_s(\eta^*)$ that is equal to $1/\omega_\diamond(Q, s)$.

5. For $\eta \in (0, \eta^*)$, we have $f_s(\eta) > \eta$, and for $\eta \in (\eta^*, \infty)$, we have $f_s(\eta) < \eta$.

6. For any $\epsilon > 0$, there exists $\rho_1(\epsilon) > 1$ such that $f_s(\eta) > \rho_1(\epsilon)\eta$ as long as
$0 < \eta \le (1 - \epsilon)\eta^*$, and there exists $\rho_2(\epsilon) < 1$ such that $f_s(\eta) < \rho_2(\epsilon)\eta$ as
long as $\eta \ge (1 + \epsilon)\eta^*$.

We have transformed the problem of computing $\omega_\diamond(Q, s)$ into one of finding
the positive fixed point of a one-dimensional function $f_s(\eta)$. Property 6) of
Proposition 3 states that we could start with any $\eta_0 > 0$ and use the iteration

$$\eta_{t+1} = f_s(\eta_t),\ t = 0, 1, \dots \tag{41}$$

to find the positive fixed point $\eta^*$. In addition, if we start from two initial
points, one less than $\eta^*$ and one greater than $\eta^*$, then the gap between the
generated sequences indicates how close we are to the fixed point $\eta^*$.

Property 5) suggests finding $\eta^*$ by bisection search. Suppose we have an
interval $(\eta_L, \eta_U)$ that includes $\eta^*$. Consider the middle point $\eta_M = (\eta_L + \eta_U)/2$. If
$f_s(\eta_M) < \eta_M$, we conclude that $\eta^* < \eta_M$ and we set $\eta_U = f_s(\eta_M)$; if $f_s(\eta_M) >
\eta_M$, we conclude that $\eta^* > \eta_M$ and we set $\eta_L = f_s(\eta_M)$. Both updates keep $\eta^*$
inside the interval because $f_s$ is strictly increasing, so $f_s(\eta_M)$ lies between $\eta^*$
and $\eta_M$. We continue this bisection procedure until the interval length
$\eta_U - \eta_L$ is sufficiently small.
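The two search procedures can be sketched on a toy function. Evaluating the true $f_s$ requires solving the optimization (40), so the sketch below substitutes a hypothetical stand-in `f_toy` that merely shares the qualitative properties in Proposition 3 (continuous, strictly increasing, $f(0) = 0$, at least $s\eta$ near zero, below $\rho\eta$ for large $\eta$); its unique positive fixed point is $2$ by construction.

```python
def f_toy(eta, s=2.0):
    """Hypothetical stand-in for f_s; unique positive fixed point at eta = 2."""
    return min(s * eta, 1.0 + 0.5 * eta)

def fixed_point_iteration(f, eta0, tol=1e-10, max_iter=10_000):
    """Iterate eta <- f(eta) as in (41) until successive iterates differ by at most tol."""
    eta = eta0
    for _ in range(max_iter):
        nxt = f(eta)
        if abs(nxt - eta) <= tol:
            return nxt
        eta = nxt
    return eta

def bisection(f, lo, hi, tol=1e-10):
    """Assumes lo < eta* < hi; tightens the bracket with f(mid), as described in the text."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        val = f(mid)
        if val < mid:        # mid is above the fixed point, and f(mid) is too
            hi = val
        elif val > mid:      # mid is below the fixed point, and f(mid) is too
            lo = val
        else:
            return mid
    return 0.5 * (lo + hi)

from_below = fixed_point_iteration(f_toy, 0.1)
from_above = fixed_point_iteration(f_toy, 10.0)
by_bisection = bisection(f_toy, 0.1, 10.0)
```

Using `f(mid)` rather than `mid` as the new endpoint shrinks the bracket by at least half per step, and often by more.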



5.3 Relaxation of the Subproblem

Unfortunately, except when $n = 1$ and the signal is real, i.e., the real sparse
case, it is not easy to compute $f_s(\eta)$ according to (40). In the following
proposition, we present a relaxation of the subproblem

$$\max_z \|z\|_{b\infty} \quad \text{s.t.} \quad \|Qz\|_\diamond \le 1,\ \|z\|_{b1} \le s\eta \tag{42}$$

by computing an upper bound on $f_s(\eta)$. The proof is similar to that of
Proposition 2 and is given in Appendix 8.8.

Proposition 4 When $Q = A$ and $\diamond = 2$, we have

$$f_s(\eta) \le \max_i \min_{P_i} \max_j\ s\eta \|\delta_{ij} I_n - P_i^T Q_j\|_2 + \|P_i\|_2, \tag{43}$$

and when $Q = A^T A$ and $\diamond = b\infty$, we have

$$f_s(\eta) \le \max_i \min_{P_i} \max_j\ s\eta \|\delta_{ij} I_n - P_i^T Q_j\|_2 + \sum_{l=1}^{p} \|P_i^l\|_2. \tag{44}$$

Here $P_i$ (resp. $Q_j$) is the submatrix of $P$ (resp. $Q$) formed by the $((i-1)n+1)$th
to $in$th columns (resp. $((j-1)n+1)$th to $jn$th columns), and $P_i^l$ is the submatrix
of $P$ formed by the $((i-1)n+1)$th to $in$th columns and the $((l-1)n+1)$th to
$ln$th rows.

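For each $i = 1, \dots, p$, the optimization problem

$$\min_{P_i} \max_j\ s\eta \|\delta_{ij} I_n - P_i^T Q_j\|_2 + \|P_i\|_2 \tag{45}$$

can be solved using semidefinite programming:

$$\begin{aligned}
&\min_{P_i, t_0, t_1} s\eta t_0 + t_1 \quad \text{s.t.} \quad \|\delta_{ij} I_n - P_i^T Q_j\|_2 \le t_0,\ j = 1, \dots, p,\quad \|P_i\|_2 \le t_1 \\
\Leftrightarrow\ &\min_{P_i, t_0, t_1} s\eta t_0 + t_1 \quad \text{s.t.} \quad
\begin{bmatrix} t_0 I_n & \delta_{ij} I_n - P_i^T Q_j \\ \delta_{ij} I_n - Q_j^T P_i & t_0 I_n \end{bmatrix} \succeq 0,\ j = 1, \dots, p,\quad
\begin{bmatrix} t_1 I_m & P_i \\ P_i^T & t_1 I_n \end{bmatrix} \succeq 0.
\end{aligned} \tag{46}$$

Similarly, the optimization problem

$$\min_{P_i} \max_j\ s\eta \|\delta_{ij} I_n - P_i^T Q_j\|_2 + \sum_{l=1}^{p} \|P_i^l\|_2 \tag{47}$$

can be solved by the following semidefinite program:

$$\begin{aligned}
&\min_{P_i, t_0, t_1, \dots, t_p} s\eta t_0 + \sum_{l=1}^{p} t_l \quad \text{s.t.} \quad \|\delta_{ij} I_n - P_i^T Q_j\|_2 \le t_0,\ j = 1, \dots, p,\quad \|P_i^l\|_2 \le t_l,\ l = 1, \dots, p \\
\Leftrightarrow\ &\min_{P_i, t_0, t_1, \dots, t_p} s\eta t_0 + \sum_{l=1}^{p} t_l \quad \text{s.t.} \quad
\begin{bmatrix} t_0 I_n & \delta_{ij} I_n - P_i^T Q_j \\ \delta_{ij} I_n - Q_j^T P_i & t_0 I_n \end{bmatrix} \succeq 0,\ j = 1, \dots, p,\quad
\begin{bmatrix} t_l I_n & P_i^l \\ (P_i^l)^T & t_l I_n \end{bmatrix} \succeq 0,\ l = 1, \dots, p.
\end{aligned} \tag{48}$$

These semidefinite programs can also be solved using first-order techniques.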
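The reason the objective of (45) bounds the subproblem (42) can be checked numerically for $Q = A$, $\diamond = 2$. For any fixed $P_i$, writing $z_i = \sum_j (\delta_{ij} I_n - P_i^T A_j) z_j + P_i^T A z$ gives $\|z_i\|_2 \le s\eta \max_j \|\delta_{ij} I_n - P_i^T A_j\|_2 + \|P_i\|_2$ for every feasible $z$. The sketch below tests this on random feasible points; the particular $P_i$, the sizes, and the function names are assumptions made for illustration, and the block index `i` is 0-based in the code.

```python
import numpy as np

def upper_bound_block(A, n, i, P_i, s_eta):
    """Value of s*eta*max_j ||delta_ij I - P_i^T A_j||_2 + ||P_i||_2 for a given P_i."""
    p = A.shape[1] // n
    worst = 0.0
    for j in range(p):
        target = np.eye(n) if i == j else np.zeros((n, n))
        worst = max(worst, np.linalg.norm(target - P_i.T @ A[:, j * n:(j + 1) * n], 2))
    return s_eta * worst + np.linalg.norm(P_i, 2)

def sample_feasible(A, n, s_eta, rng):
    """Random z rescaled so that ||A z||_2 <= 1 and ||z||_b1 <= s*eta both hold."""
    p = A.shape[1] // n
    z = rng.standard_normal(A.shape[1])
    b1 = np.linalg.norm(z.reshape(p, n), axis=1).sum()
    return z * min(s_eta / b1, 1.0 / np.linalg.norm(A @ z))

rng = np.random.default_rng(3)
n, p, m = 2, 6, 8                                 # hypothetical sizes
A = rng.standard_normal((m, n * p)) / np.sqrt(m)
i, s_eta = 0, 1.5                                 # first block, s*eta = 1.5
A_i = A[:, :n]
P_i = A_i @ np.linalg.inv(A_i.T @ A_i)            # one feasible (not optimal) choice

bound = upper_bound_block(A, n, i, P_i, s_eta)
largest = max(np.linalg.norm(sample_feasible(A, n, s_eta, rng)[:n]) for _ in range(1000))
```

In every run `largest` stays below `bound`. Minimizing over $P_i$, as the semidefinite program (46) does, tightens the bound, and maximizing over $i$ turns the per-block bounds into a bound on $\|z\|_{b\infty}$.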
5.4 Fixed Point Theory for Computing a Lower Bound on $\omega_\diamond$

Although Proposition 4 provides ways to efficiently compute upper bounds on
the subproblem (42) for fixed $\eta$, it is not obvious whether we could use it to
compute an upper bound on the positive fixed point of $f_s(\eta)$, or $1/\omega_\diamond(Q, s)$.
We show in this subsection that another iterative procedure can compute such
upper bounds.

To this end, we define functions $g_{s,i}(\eta)$ and $g_s(\eta)$ over $[0, \infty)$, parameterized
by $s$ for $s \in (1, s_*)$,

$$g_{s,i}(\eta) = \min_{P_i}\ s\eta \left( \max_j \|\delta_{ij} I_n - P_i^T Q_j\|_2 \right) + \|P_i\|_2, \text{ and}$$

$$g_s(\eta) = \max_i g_{s,i}(\eta). \tag{49}$$

The following proposition, whose proof is given in Appendix 8.9, lists some
properties of $g_{s,i}(\eta)$ and $g_s(\eta)$.

Proposition 5 The functions $g_{s,i}(\eta)$ and $g_s(\eta)$ have the following properties:

1. $g_{s,i}(\eta)$ and $g_s(\eta)$ are continuous in $\eta$.

2. $g_{s,i}(\eta)$ and $g_s(\eta)$ are strictly increasing in $\eta$.

3. $g_{s,i}(\eta)$ is concave for every $i$.

4. $g_s(0) = 0$, $g_s(\eta) \ge s\eta > \eta$ for sufficiently small $\eta > 0$, and there exists
$\rho < 1$ such that $g_s(\eta) < \rho\eta$ for sufficiently large $\eta$; the same holds for
$g_{s,i}(\eta)$.

5. $g_{s,i}(\eta)$ and $g_s(\eta)$ have unique positive fixed points $\eta_i^* = g_{s,i}(\eta_i^*)$ and $\eta^* =
g_s(\eta^*)$, respectively, and $\eta^* = \max_i \eta_i^*$.

6. For $\eta \in (0, \eta^*)$, we have $g_s(\eta) > \eta$, and for $\eta \in (\eta^*, \infty)$, we have $g_s(\eta) < \eta$;
the same statement holds also for $g_{s,i}(\eta)$.

7. For any $\epsilon > 0$, there exists $\rho_1(\epsilon) > 1$ such that $g_s(\eta) > \rho_1(\epsilon)\eta$ as long as
$0 < \eta \le (1 - \epsilon)\eta^*$, and there exists $\rho_2(\epsilon) < 1$ such that $g_s(\eta) < \rho_2(\epsilon)\eta$ as
long as $\eta \ge (1 + \epsilon)\eta^*$.

The same properties in Proposition 5 hold for the functions defined below:

$$h_{s,i}(\eta) = \min_{P_i}\ s\eta \left( \max_j \|\delta_{ij} I_n - P_i^T Q_j\|_2 \right) + \sum_{l=1}^{p} \|P_i^l\|_2, \text{ and} \tag{50}$$

$$h_s(\eta) = \max_i h_{s,i}(\eta). \tag{51}$$

Proposition 6 The functions $h_{s,i}(\eta)$ and $h_s(\eta)$ have the following properties:

1. $h_{s,i}(\eta)$ and $h_s(\eta)$ are continuous in $\eta$.

2. $h_{s,i}(\eta)$ and $h_s(\eta)$ are strictly increasing in $\eta$.

3. $h_{s,i}(\eta)$ is concave for every $i$.

4. $h_s(0) = 0$, $h_s(\eta) \ge s\eta > \eta$ for sufficiently small $\eta > 0$, and there exists
$\rho < 1$ such that $h_s(\eta) < \rho\eta$ for sufficiently large $\eta$; the same holds for
$h_{s,i}(\eta)$.

5. $h_{s,i}(\eta)$ and $h_s(\eta)$ have unique positive fixed points $\eta_i^* = h_{s,i}(\eta_i^*)$ and $\eta^* =
h_s(\eta^*)$, respectively, and $\eta^* = \max_i \eta_i^*$.

6. For $\eta \in (0, \eta^*)$, we have $h_s(\eta) > \eta$, and for $\eta \in (\eta^*, \infty)$, we have $h_s(\eta) < \eta$;
the same statement holds also for $h_{s,i}(\eta)$.

7. For any $\epsilon > 0$, there exists $\rho_1(\epsilon) > 1$ such that $h_s(\eta) > \rho_1(\epsilon)\eta$ as long as
$0 < \eta \le (1 - \epsilon)\eta^*$, and there exists $\rho_2(\epsilon) < 1$ such that $h_s(\eta) < \rho_2(\epsilon)\eta$ as
long as $\eta \ge (1 + \epsilon)\eta^*$.

An immediate consequence of Propositions 5 and 6 is the following:

Theorem 4 Suppose $\eta^*$ is the unique positive fixed point of $g_s(\eta)$ ($h_s(\eta)$, resp.), then
we have

$$\eta^* \ge \frac{1}{\omega_2(A, s)} \quad \left( \frac{1}{\omega_{b\infty}(A^T A, s)}, \text{ resp.} \right). \tag{52}$$

Proposition 5 implies three ways to compute the fixed point $\eta^*$ for $g_s(\eta)$.
The same discussion is also valid for $h_s(\eta)$.

1. Naive Fixed Point Iteration: Property 7) of Proposition 5 suggests that
the fixed point iteration

$$\eta_{t+1} = g_s(\eta_t),\ t = 0, 1, \dots \tag{53}$$

starting from any initial point $\eta_0$ converges to $\eta^*$, no matter whether $\eta_0 <
\eta^*$ or $\eta_0 > \eta^*$. The algorithm can be made more efficient in the case
$\eta_0 < \eta^*$. More specifically, since $g_s(\eta) = \max_i g_{s,i}(\eta)$, at each fixed point
iteration, we set $\eta_{t+1}$ to be the first $g_{s,i}(\eta_t)$ that is greater than $\eta_t + \epsilon$,
with $\epsilon$ being some tolerance parameter. If for all $i$, $g_{s,i}(\eta_t) < \eta_t + \epsilon$, then
$g_s(\eta_t) = \max_i g_{s,i}(\eta_t) < \eta_t + \epsilon$, which indicates that the optimal function
value cannot be improved greatly and the algorithm should terminate. In
most cases, to get $\eta_{t+1}$, we need to solve only one optimization problem,
$\min_{P_i} s\eta_t \left( \max_j \|\delta_{ij} I_n - P_i^T Q_j\|_2 \right) + \|P_i\|_2$, instead of $p$ of them. This is
in contrast to the case where $\eta_0 > \eta^*$, because in the latter case we must
compute all $g_{s,i}(\eta_t)$ to update $\eta_{t+1} = \max_i g_{s,i}(\eta_t)$. An update based on a
single $g_{s,i}(\eta_t)$ might generate a value smaller than $\eta^*$.
The naive fixed point iteration has two major disadvantages. First, the
stopping criterion based on successive improvement is not accurate as it
does not reflect the gap between $\eta_t$ and $\eta^*$. This disadvantage can be
remedied by starting from both below and above $\eta^*$. The distance between
corresponding terms in the two generated sequences is an indication of the
gap to the fixed point $\eta^*$. However, the resulting algorithm is very slow,
especially when updating $\eta_{t+1}$ from above $\eta^*$. Second, the iteration process
is slow, especially when close to the fixed point $\eta^*$, because $\rho_1(\epsilon)$ and $\rho_2(\epsilon)$
in 7) of Proposition 5 are close to 1.

2. Bisection: The bisection approach is motivated by property 6) of
Proposition 5. Starting from an initial interval $(\eta_L, \eta_U)$ that contains $\eta^*$, we
compute $g_s(\eta_M)$ with $\eta_M = (\eta_L + \eta_U)/2$. As a consequence of property 6),
$g_s(\eta_M) > \eta_M$ implies $g_s(\eta_M) < \eta^*$, and we set $\eta_L = g_s(\eta_M)$; $g_s(\eta_M) < \eta_M$
implies $g_s(\eta_M) > \eta^*$, and we set $\eta_U = g_s(\eta_M)$. The bisection process can
also be accelerated by setting $\eta_L = g_{s,i}(\eta_M)$ for the first $g_{s,i}(\eta_M)$ greater
than $\eta_M$. The convergence of the bisection approach is much faster than
the naive fixed point iteration because each iteration reduces the interval
length at least by half. In addition, half the length of the interval is an
upper bound on the gap between $\eta_M$ and $\eta^*$, resulting in an accurate
stopping criterion. However, if the initial $\eta_U$ is too much larger than $\eta^*$, the
majority of $g_s(\eta_M)$ would turn out to be less than $\eta_M$. The verification of
$g_s(\eta_M) < \eta_M$ requires solving $p$ semidefinite programs, greatly degrading
the algorithm's performance.

3. Fixed Point Iteration + Bisection: The third approach combines the
advantages of the bisection method and the fixed point iteration method,
at the level of $g_{s,i}(\eta)$. This method relies on the representations $g_s(\eta) =
\max_i g_{s,i}(\eta)$ and $\eta^* = \max_i \eta_i^*$.

Starting from an initial interval $(\eta_{L0}, \eta_U)$ and the index set $I_0 = \{1, \dots, p\}$,
we pick any $i_0 \in I_0$ and use the (accelerated) bisection method with starting
interval $(\eta_{L0}, \eta_U)$ to find the positive fixed point $\eta_{i_0}^*$ of $g_{s,i_0}(\eta)$. For any
$i \in I_0 \setminus \{i_0\}$, $g_{s,i}(\eta_{i_0}^*) \le \eta_{i_0}^*$ implies that the fixed point $\eta_i^*$ of $g_{s,i}(\eta)$ is less
than or equal to $\eta_{i_0}^*$ according to the continuity of $g_{s,i}(\eta)$ and the uniqueness
of its positive fixed point. As a consequence, we remove this $i$ from the
index set $I_0$. We denote by $I_1$ the index set after all such $i$'s are removed,
i.e., $I_1 = I_0 \setminus \{i : g_{s,i}(\eta_{i_0}^*) \le \eta_{i_0}^*\}$. We also set $\eta_{L1} = \eta_{i_0}^*$ as $\eta^* \ge \eta_{i_0}^*$. Next
we test the $i_1 \in I_1$ with the largest $g_{s,i}(\eta_{i_0}^*)$ and construct $I_2$ and $\eta_{L2}$ in a
similar manner. We repeat the process until the index set $I_t$ is empty. The
$\eta_i^*$ found at the last step is the maximal $\eta_i^*$, which is equal to $\eta^*$.
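The third approach can be sketched on toy functions. Evaluating the true $g_{s,i}$ requires a semidefinite solver, so the sketch substitutes hypothetical stand-ins produced by `make_g`, each of which is continuous, strictly increasing, and concave with a known positive fixed point $2c$. The sketch also assumes that the initial lower endpoint lies below the fixed point of the first function tested and that the upper endpoint lies above all fixed points.

```python
def make_g(c, s=2.0):
    """Hypothetical stand-in for g_{s,i}; its unique positive fixed point is 2*c."""
    return lambda eta: min(s * eta, c + 0.5 * eta)

def fixed_point_by_bisection(g, lo, hi, tol=1e-10):
    """Bisection with the bracket tightened by g(mid); assumes lo < eta_i* < hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        val = g(mid)
        if val > mid:
            lo = val
        elif val < mid:
            hi = val
        else:
            return mid
    return 0.5 * (lo + hi)

def max_fixed_point(gs, lo, hi, tol=1e-10):
    """Largest positive fixed point among the functions in gs, with index pruning."""
    active = set(range(len(gs)))
    i = min(active)                      # any starting index works
    best = lo
    while active:
        best = fixed_point_by_bisection(gs[i], best, hi, tol)
        active.discard(i)
        # g_k(best) <= best means the fixed point of g_k is at most best: drop k.
        active = {k for k in active if gs[k](best) > best}
        if active:
            i = max(active, key=lambda k: gs[k](best))
    return best

gs = [make_g(c) for c in (0.5, 1.0, 0.8, 0.3)]   # fixed points 1.0, 2.0, 1.6, 0.6
eta_star = max_fixed_point(gs, 0.01, 10.0)
```

In this run only two of the four functions are ever bisected: the first bisection yields $1.0$, which eliminates the function with fixed point $0.6$, and the second yields $2.0$, which eliminates the one with fixed point $1.6$. Each surviving index costs one function evaluation per pruning pass, which is the saving the method aims for when every evaluation is a semidefinite program.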



6 Preliminary Numerical Experiments

In this section, we present preliminary numerical results that assess the
performance of the algorithms for verifying $\omega_2(A, s) > 0$ and computing $\omega_2(A, s)$.
We also compare the error bounds based on $\omega_2(A, s)$ with the bounds based on
the block RIP [7]. The involved semidefinite programs are solved using CVX.

We test the algorithms on Gaussian random matrices. The entries of Gaussian
matrices are randomly generated from the standard Gaussian distribution.
All $m \times np$ matrices are normalized to have columns of unit length.



We first present the values of $s_*$ computed by (34), and the values of $k_*$
($k_* = \lfloor s_*/2 \rfloor$), and compare them with the corresponding quantities when $A$
is used as the sensing matrix for sparsity recovery without knowing the
block-sparsity structure. The quantities in the latter case are computed using the
algorithms developed in [20], [21]. We note in Table 1 that for the same sensing
matrix $A$, both $s_*$ and $k_*$ are smaller when the block-sparse structure is taken
into account than when it is not taken into account. However, we need to keep
in mind that the true sparsity level in the block-sparse model is $nk$, where $k$
is the block sparsity level. The $nk_*$ in the fourth column for the block-sparse
model is indeed much greater than the $k_*$ in the sixth column for the sparse
model, implying that exploiting the block-sparsity structure is advantageous.



Table 1 Comparison of the sparsity level bounds on block-sparsity recovery and sparsity
recovery for a Gaussian sensing matrix $A \in \mathbb{R}^{m \times np}$ with $n = 4$, $p = 60$.

         Block Sparse Model         Sparse Model
  m      s_*      k_*     nk_*      s_*      k_*
  72     3.96     1       4         6.12     3
  96     4.87     2       8         7.55     3
  120    5.94     2       8         9.54     4
  144    7.14     3       12        11.96    5
  168    8.60     4       16        14.66    7
  192    11.02    5       20        18.41    9



In the next set of experiments, we compare the computation times for
the three implementation methods discussed at the end of Section 5.4. The
Gaussian matrix $A$ is of size $72 \times 120$ with $n = 3$ and $p = 40$. The tolerance
parameter is $10^{…}$. The initial $\eta$ value for the naive fixed point iteration is
0.1. The initial lower bound $\eta_L$ and upper bound $\eta_U$ are set as 0.1 and 10,
respectively. All three implementations yield $\eta^* = 0.7034$. The CPU times for
the three methods are 393 seconds, 1309 seconds, and 265 seconds. Therefore,
the fixed point iteration + bisection gives the most efficient implementation
in general. The bisection search method is slow in this case because the initial
$\eta_U$ is too much larger than $\eta^*$.

In the last experiment, we compare our recovery error bounds on the BS-BP
based on $\omega_2(A, s)$ with those based on the block RIP. Recall that from
Corollary 2 we have for the BS-BP

$$\|\hat{x} - x\|_2 \le \frac{2\sqrt{2k}}{\omega_2(A, 2k)}\,\varepsilon. \qquad (54)$$

For comparison, the block RIP bound (55) is valid
assuming the block RIP constant satisfies $\delta_{2k}(A) < \sqrt{2} - 1$ [7]. Without loss of generality, we
set $\varepsilon = 1$.

The block RIP is computed using Monte Carlo simulations. More explicitly,
for $\delta_{2k}(A)$, we randomly take 1000 sub-matrices of $A \in \mathbb{R}^{m \times np}$ of size $m \times 2nk$
with a pattern determined by the block-sparsity structure, compute the
maximal and minimal singular values $\sigma_1$ and $\sigma_{2k}$, and approximate $\delta_{2k}(A)$
using the maximum of $\max(\sigma_1^2 - 1, 1 - \sigma_{2k}^2)$ among all sampled sub-matrices.
Obviously, the approximated block RIP is always smaller than or equal to
the exact block RIP. As a consequence, the performance bounds based on
the exact block RIP are worse than those based on the approximated block
RIP. Therefore, in cases where our $\omega_2(A, 2k)$ based bounds are better (tighter,
smaller) than the approximated block RIP bounds, they are even better than
the exact block RIP bounds.
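The Monte Carlo approximation described above can be sketched as follows. This is only an illustration, assuming NumPy and assuming that the blocks are groups of $n$ consecutive columns; the function name `estimate_block_rip`, the seeds, and the small trial counts in the usage lines are hypothetical and are not the settings of the reported experiments.

```python
import numpy as np

def estimate_block_rip(A, n, k, trials=1000, seed=0):
    """Monte Carlo lower estimate of the block RIP constant delta_{2k}(A).

    A is assumed to consist of p blocks of n consecutive columns.
    Each trial keeps 2k randomly chosen blocks (an m x 2nk sub-matrix).
    """
    rng = np.random.default_rng(seed)
    p = A.shape[1] // n
    delta = 0.0
    for _ in range(trials):
        blocks = rng.choice(p, size=2 * k, replace=False)
        # Column indices belonging to the selected blocks.
        cols = (blocks[:, None] * n + np.arange(n)).ravel()
        sigma = np.linalg.svd(A[:, cols], compute_uv=False)
        # Largest and smallest singular values of the sub-matrix.
        delta = max(delta, sigma[0] ** 2 - 1.0, 1.0 - sigma[-1] ** 2)
    return delta

rng = np.random.default_rng(1)
A = rng.standard_normal((72, 240))
A /= np.linalg.norm(A, axis=0, keepdims=True)
delta_few = estimate_block_rip(A, n=4, k=1, trials=50)
delta_many = estimate_block_rip(A, n=4, k=1, trials=200)
```

Because the estimate is a maximum over sampled sub-matrices, adding trials can only increase it, which is why it never exceeds the exact block RIP constant.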

In Table 2 we present the values of $\omega_2(A, 2k)$ and $\delta_{2k}(A)$ computed for
a Gaussian matrix $A \in \mathbb{R}^{m \times np}$ with $n = 4$ and $p = 60$. The corresponding
$s_*$ and $k_*$ for different $m$ are also included in the table. We note that in all
the considered cases, $\delta_{2k}(A) > \sqrt{2} - 1$, and the block RIP based bound (55)
does not apply at all. In contrast, the $\omega_2$ based bound (54) is valid as long as
$k \le k_*$. In Table 3 we show the $\omega_2$ based bound (54).



Table 2 $\omega_2(A, 2k)$ and $\delta_{2k}(A)$ computed for a Gaussian matrix $A \in \mathbb{R}^{m \times np}$ with $n = 4$
and $p = 60$.

  m                        72      96      120     144     168     192
  s_*                      3.88    4.78    5.89    7.02    8.30    10.80
  k_*                      1       2       2       3       4       5
  k = 1   ω_2(A, 2k)       0.45    0.53    0.57    0.62    0.65    0.67
          δ_2k(A)          0.90    0.79    0.66    0.58    0.55    0.51
  k = 2   ω_2(A, 2k)       -       0.13    0.25    0.33    0.39    0.43
          δ_2k(A)          -       1.08    0.98    0.96    0.84    0.75
  k = 3   ω_2(A, 2k)       -       -       -       0.11    0.18    0.25
          δ_2k(A)          -       -       -       1.12    1.01    0.93
  k = 4   ω_2(A, 2k)       -       -       -       -       0.02    0.12
          δ_2k(A)          -       -       -       -       1.26    1.07
  k = 5   ω_2(A, 2k)       -       -       -       -       -       0.03
          δ_2k(A)          -       -       -       -       -       1.28



Table 3 The $\omega_2(A, 2k)$ based bounds on the $\ell_2$ norms of the errors of the BS-BP for the
Gaussian matrix in Table 2

  m                        72      96      120     144     168     192
  s_*                      3.88    4.78    5.89    7.02    8.30    10.80
  k_*                      1       2       2       3       4       5
  k = 1   ω_2 bound        6.22    13.01   9.89    6.50    11.52   9.50
  k = 2   ω_2 bound        -       58.56   25.37   14.64   7.30    16.26
  k = 3   ω_2 bound        -       -       -       53.54   21.63   30.27
  k = 4   ω_2 bound        -       -       -       -       236.74  23.25
  k = 5   ω_2 bound        -       -       -       -       -       127.59



7 Conclusions 



In this paper, we analyzed the performance of convex block-sparse signal
recovery algorithms using the block-$\ell_\infty$ norm of the errors as a performance
criterion. We expressed other popular performance criteria in terms of the
block-$\ell_\infty$ norm. A family of goodness measures of the sensing matrices was
defined using optimization procedures. We used these goodness measures to
derive upper bounds on the block-$\ell_\infty$ norms of the reconstruction errors for
the Block-Sparse Basis Pursuit, the Block-Sparse Dantzig Selector, and the 
Block-Sparse LASSO estimator. Efficient algorithms based on fixed point it- 
eration, bisection, and semidefinite programming were implemented to solve 
the optimization procedures defining the goodness measures. We expect that 
these goodness measures will be useful in comparing different sensing systems 
and recovery algorithms, as well as in designing optimal sensing matrices. In 
future work, we will use these computable performance bounds to optimally 
design transmitting waveforms for compressive sensing radar. 



8 Appendix: Proofs 

8.1 Proof of Proposition 1

Proof Suppose $S = \mathrm{bsupp}(x)$ and $|S| = \|x\|_{b0} = k$. Define the error vector
$h = \hat{x} - x$.

We first prove the proposition for the BS-BP and the BS-DS. The fact that
$\|\hat{x}\|_{b1} = \|x + h\|_{b1}$ is the minimum among all $z$'s satisfying the constraints
in (5) and (6), together with the fact that the true signal $x$ satisfies the
constraints as required by the conditions imposed on the noise in Proposition 1,
implies that $\|h_{S^c}\|_{b1}$ cannot be very large. To see this, note that

$$\begin{aligned}
\|x\|_{b1} &\ge \|x + h\|_{b1} \\
&= \sum_{i \in S} \|x_i + h_i\|_2 + \sum_{i \in S^c} \|x_i + h_i\|_2 \\
&\ge \sum_{i \in S} \|x_i\|_2 - \sum_{i \in S} \|h_i\|_2 + \sum_{i \in S^c} \|h_i\|_2 \\
&= \|x_S\|_{b1} - \|h_S\|_{b1} + \|h_{S^c}\|_{b1} \\
&= \|x\|_{b1} - \|h_S\|_{b1} + \|h_{S^c}\|_{b1}.
\end{aligned} \qquad (56)$$

Therefore, we obtain $\|h_S\|_{b1} \ge \|h_{S^c}\|_{b1}$, which leads to

$$2\|h_S\|_{b1} \ge \|h_S\|_{b1} + \|h_{S^c}\|_{b1} = \|h\|_{b1}. \qquad (57)$$

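We now turn to the BS-LASSO (7). Since the noise $w$ satisfies $\|A^T w\|_{b\infty} \le \kappa\mu$
for some $\kappa \in (0, 1)$, and $\hat{x}$ is the minimizer of (7), we have

$$\frac{1}{2}\|A\hat{x} - y\|_2^2 + \mu\|\hat{x}\|_{b1} \le \frac{1}{2}\|Ax - y\|_2^2 + \mu\|x\|_{b1}.$$

Consequently, substituting $y = Ax + w$ yields

$$\begin{aligned}
\mu\|\hat{x}\|_{b1} &\le \frac{1}{2}\|Ax - y\|_2^2 - \frac{1}{2}\|A\hat{x} - y\|_2^2 + \mu\|x\|_{b1} \\
&= \frac{1}{2}\|w\|_2^2 - \frac{1}{2}\|A(\hat{x} - x) - w\|_2^2 + \mu\|x\|_{b1} \\
&= \frac{1}{2}\|w\|_2^2 - \frac{1}{2}\|A(\hat{x} - x)\|_2^2 + \langle A(\hat{x} - x), w \rangle - \frac{1}{2}\|w\|_2^2 + \mu\|x\|_{b1} \\
&\le \langle A(\hat{x} - x), w \rangle + \mu\|x\|_{b1} \\
&= \langle \hat{x} - x, A^T w \rangle + \mu\|x\|_{b1}.
\end{aligned}$$

Using the Cauchy-Schwarz type inequality, we get

$$\begin{aligned}
\mu\|\hat{x}\|_{b1} &\le \|\hat{x} - x\|_{b1}\|A^T w\|_{b\infty} + \mu\|x\|_{b1} \\
&\le \kappa\mu\|h\|_{b1} + \mu\|x\|_{b1},
\end{aligned}$$

which leads to

$$\|\hat{x}\|_{b1} \le \kappa\|h\|_{b1} + \|x\|_{b1}.$$

Therefore, similar to the argument in (56), we have

$$\begin{aligned}
\|x\|_{b1} &\ge \|\hat{x}\|_{b1} - \kappa\|h\|_{b1} \\
&= \|x + h_{S^c} + h_S\|_{b1} - \kappa\left(\|h_{S^c} + h_S\|_{b1}\right) \\
&\ge \|x + h_{S^c}\|_{b1} - \|h_S\|_{b1} - \kappa\left(\|h_{S^c}\|_{b1} + \|h_S\|_{b1}\right) \\
&= \|x\|_{b1} + (1 - \kappa)\|h_{S^c}\|_{b1} - (1 + \kappa)\|h_S\|_{b1},
\end{aligned}$$

where $S = \mathrm{bsupp}(x)$. Consequently, we have

$$\|h_S\|_{b1} \ge \frac{1 - \kappa}{1 + \kappa}\|h_{S^c}\|_{b1}.$$

Therefore, similar to (57), we obtain

$$\begin{aligned}
\frac{2}{1 - \kappa}\|h_S\|_{b1} &= \frac{1 + \kappa}{1 - \kappa}\|h_S\|_{b1} + \frac{1 - \kappa}{1 - \kappa}\|h_S\|_{b1} \\
&\ge \frac{1 + \kappa}{1 - \kappa}\cdot\frac{1 - \kappa}{1 + \kappa}\|h_{S^c}\|_{b1} + \frac{1 - \kappa}{1 - \kappa}\|h_S\|_{b1} \\
&= \|h\|_{b1}.
\end{aligned} \qquad (58)$$

□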
8.2 Proof of Corollary 1

Proof Suppose $S = \mathrm{bsupp}(x)$. According to Proposition 1, we have

$$\|h\|_{b1} \le c\|h_S\|_{b1} \le ck\|h\|_{b\infty}. \qquad (59)$$

To prove (13), we note

$$\frac{\|h\|_2^2}{\|h\|_{b\infty}^2} = \sum_{i=1}^{p} \left(\frac{\|h_i\|_2}{\|h\|_{b\infty}}\right)^2 \le \sum_{i=1}^{p} \frac{\|h_i\|_2}{\|h\|_{b\infty}} = \frac{\|h\|_{b1}}{\|h\|_{b\infty}} \le ck. \qquad (60)$$

For the first inequality, we have used $\|h_i\|_2/\|h\|_{b\infty} \le 1$ and $a^2 \le a$ for $a \in [0, 1]$.

For the last assertion, note that if $\|h\|_{b\infty} < \beta/2$, then we have for $i \in S$,

$$\|\hat{x}_i\|_2 = \|x_i + h_i\|_2 \ge \|x_i\|_2 - \|h_i\|_2 > \beta - \beta/2 = \beta/2; \qquad (61)$$

and for $i \notin S$,

$$\|\hat{x}_i\|_2 = \|x_i + h_i\|_2 = \|h_i\|_2 < \beta/2. \qquad (62)$$

□
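The middle step of (60), namely $\|h\|_2^2/\|h\|_{b\infty}^2 \le \|h\|_{b1}/\|h\|_{b\infty}$, can be checked numerically. The sketch below assumes NumPy and assumes that blocks are consecutive groups of $n$ entries; the helper names `block_norms`, `norm_b1`, and `norm_binf` are illustrative and do not come from the paper.

```python
import numpy as np

def block_norms(z, n):
    """l2 norms of the consecutive length-n blocks of z."""
    return np.linalg.norm(np.asarray(z, dtype=float).reshape(-1, n), axis=1)

def norm_b1(z, n):
    """Block-l1 norm: sum of the block l2 norms."""
    return block_norms(z, n).sum()

def norm_binf(z, n):
    """Block-l_infinity norm: largest block l2 norm."""
    return block_norms(z, n).max()

rng = np.random.default_rng(0)
n, p = 4, 60
h = rng.standard_normal(n * p)

# Left and right sides of the middle inequality in (60).
lhs = np.linalg.norm(h) ** 2 / norm_binf(h, n) ** 2
rhs = norm_b1(h, n) / norm_binf(h, n)
```

The inequality holds for every nonzero vector, not only for error vectors of the recovery algorithms; the final bound by $ck$ is where Proposition 1 enters.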



8.3 Proof of Theorem 1

Proof Observe that for the BS-BP

$$\begin{aligned}
\|A(\hat{x} - x)\|_2 &\le \|y - A\hat{x}\|_2 + \|y - Ax\|_2 \\
&\le \varepsilon + \|w\|_2 \\
&\le 2\varepsilon,
\end{aligned} \qquad (63)$$

and similarly,

$$\|A^T A(\hat{x} - x)\|_{b\infty} \le 2\mu \qquad (64)$$

for the BS-DS, and

$$\|A^T A(\hat{x} - x)\|_{b\infty} \le (1 + \kappa)\mu \qquad (65)$$

for the BS-LASSO. Here for the BS-LASSO, we have used

$$\|A^T(A\hat{x} - y)\|_{b\infty} \le \mu, \qquad (66)$$

a consequence of the optimality condition

$$-A^T(A\hat{x} - y) \in \mu\,\partial\|\hat{x}\|_{b1} \qquad (67)$$

and the fact that the $i$th block of any subgradient in $\partial\|\hat{x}\|_{b1}$ is $\hat{x}_i/\|\hat{x}_i\|_2$ if
$\hat{x}_i \ne 0$ and is $g$ otherwise for some $\|g\|_2 \le 1$.

The conclusions of Theorem 1 follow from equation (12) and Definition

m □ 

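8.4 Proof of Lemma 1

Proof For any $z$ such that $\|z\|_{b\infty} = 1$ and $\|z\|_{b1} \le s$, we have

$$\begin{aligned}
z^T A^T A z &= \langle z, A^T A z \rangle \\
&\le \|z\|_{b1}\|A^T A z\|_{b\infty} \\
&\le s\|A^T A z\|_{b\infty}.
\end{aligned} \qquad (68)$$

Taking the minima of both sides of (68) over $\{z : \|z\|_{b\infty} = 1, \|z\|_{b1} \le s\}$ yields

$$\omega_2^2(A, s) \le s\,\omega_{b\infty}(A^T A, s). \qquad (69)$$

For the other inequality, note that $\|z\|_{b1}/\|z\|_{b\infty} \le s$ implies $\|z\|_{b1} \le s\|z\|_{b\infty} \le s\|z\|_2$,
or equivalently,

$$\{z : \|z\|_{b1}/\|z\|_{b\infty} \le s\} \subseteq \{z : \|z\|_{b1}/\|z\|_2 \le s\}. \qquad (70)$$

As a consequence, we have

$$\begin{aligned}
\omega_2(A, s) &= \min_{\|z\|_{b1}/\|z\|_{b\infty} \le s} \frac{\|Az\|_2}{\|z\|_2}\cdot\frac{\|z\|_2}{\|z\|_{b\infty}} \\
&\ge \min_{\|z\|_{b1}/\|z\|_{b\infty} \le s} \frac{\|Az\|_2}{\|z\|_2} \\
&\ge \min_{\|z\|_{b1}/\|z\|_2 \le s} \frac{\|Az\|_2}{\|z\|_2} \\
&= \rho_{s^2}(A),
\end{aligned} \qquad (71)$$

where the first inequality is due to $\|z\|_2 \ge \|z\|_{b\infty}$, and the second inequality
is because the minimization is taken over a larger set. □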
8.5 Proof of Theorem 2

We first need some definitions. Suppose $X$ is a scalar random variable; the
Orlicz $\psi_2$ norm of $X$ is defined as

$$\|X\|_{\psi_2} = \inf\left\{t > 0 : \mathbb{E}\exp\left(\frac{|X|^2}{t^2}\right) \le 2\right\}. \qquad (72)$$

The $\psi_2$ norm is closely related to the subgaussian constant $L$. We actually can
equivalently define that a random vector $X \in \mathbb{R}^{np}$ is isotropic and subgaussian
with constant $L$ if $\mathbb{E}|\langle X, u \rangle|^2 = \|u\|_2^2$ and $\|\langle X, u \rangle\|_{\psi_2} \le L\|u\|_2$ for any
$u \in \mathbb{R}^{np}$ [20].

We use the notation $\ell_*(\mathcal{H}) = \mathbb{E}\sup_{u \in \mathcal{H}} \langle g, u \rangle$ with $g \sim \mathcal{N}(0, \mathrm{I}_{np})$, i.e., a
vector of independent zero-mean unit-variance Gaussians, to denote the
Gaussian width of any set $\mathcal{H} \subset \mathbb{R}^{np}$.

We now cite a result on the behavior of empirical processes [13, 20]:



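Theorem 5 [13, 20] Let $\{a, a_i, i = 1, \ldots, m\} \subset \mathbb{R}^{np}$ be i.i.d. isotropic and
subgaussian random vectors. $\mathcal{H}$ is a subset of the unit sphere of $\mathbb{R}^{np}$, and $\mathcal{F} =
\{f_u(\cdot) = \langle u, \cdot \rangle : u \in \mathcal{H}\}$. Suppose $\mathrm{diam}(\mathcal{F}, \|\cdot\|_{\psi_2}) := \max_{f, g \in \mathcal{F}} \|f - g\|_{\psi_2} = \alpha$.
Then there exist absolute constants $c_1, c_2, c_3$ such that for any $\epsilon > 0$ and $m \ge 1$
satisfying

$$m \ge c_1 \frac{\alpha^2 \ell_*^2(\mathcal{H})}{\epsilon^2}, \qquad (73)$$

with probability at least $1 - \exp(-c_2 \epsilon^2 m/\alpha^4)$,

$$\sup_{f \in \mathcal{F}} \left|\frac{1}{m}\sum_{k=1}^{m} f^2(a_k) - \mathbb{E}f^2(a)\right| \le \epsilon. \qquad (74)$$

Furthermore, if $\mathcal{F}$ is symmetric, we have

$$\mathbb{E}\sup_{f \in \mathcal{F}} \left|\frac{1}{m}\sum_{k=1}^{m} f^2(a_k) - \mathbb{E}f^2(a)\right| \le c_3 \max\left\{\alpha\frac{\ell_*(\mathcal{H})}{\sqrt{m}}, \frac{\ell_*^2(\mathcal{H})}{m}\right\}. \qquad (75)$$

With these preparations, we proceed to the proof of Theorem 2.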



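Proof (Proof of Theorem 2) We apply Theorem 5 to estimate the block $\ell_1$-CMSV.
According to the assumptions of Theorem 2, the rows $\{a_i\}_{i=1}^{m}$ of $\sqrt{m}A$
are i.i.d. isotropic and subgaussian random vectors with constant $L$. Consider
the set $\mathcal{H} = \{u \in \mathbb{R}^{np} : \|u\|_2^2 = 1, \|u\|_{b1}^2 \le s\}$ and the function set $\mathcal{F} =
\{f_u(\cdot) = \langle u, \cdot \rangle : u \in \mathcal{H}\}$. Clearly both $\mathcal{H}$ and $\mathcal{F}$ are symmetric. The diameter
of $\mathcal{F}$ satisfies

$$\alpha = \mathrm{diam}(\mathcal{F}, \|\cdot\|_{\psi_2}) \le 2\sup_{u \in \mathcal{H}} \|\langle u, a \rangle\|_{\psi_2} \le 2L. \qquad (76)$$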
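Since $\mathbb{E}f_u^2(a) = \mathbb{E}\langle u, a \rangle^2 = \|u\|_2^2 = 1$ when $a$ follows the same distribution
as $\{a_i\}_{i=1}^{m}$, we observe that for any $\epsilon \in (0, 1)$,

$$\rho_s^2(A) = \min_{u \in \mathcal{H}} u^T A^T A u \ge 1 - \epsilon \ge (1 - \epsilon)^2 \qquad (77)$$

is a consequence of

$$\sup_{u \in \mathcal{H}} \left|\frac{1}{m} u^T (\sqrt{m}A)^T (\sqrt{m}A) u - 1\right| \le \epsilon. \qquad (78)$$

In view of Theorem 5, the key is to estimate the Gaussian width $\ell_*(\mathcal{H})$:

$$\begin{aligned}
\ell_*(\mathcal{H}) &= \mathbb{E}\sup_{u : \|u\|_2^2 = 1, \|u\|_{b1}^2 \le s} \langle g, u \rangle \\
&\le \mathbb{E}\|u\|_{b1}\|g\|_{b\infty} \\
&\le \sqrt{s}\,\mathbb{E}\|g\|_{b\infty}.
\end{aligned} \qquad (79)$$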

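The quantity $\mathbb{E}\|g\|_{b\infty}$ can be bounded using Slepian's inequality [10, Section
3.3]. We rearrange the vector $g \in \mathbb{R}^{np}$ into a matrix $G \in \mathbb{R}^{n \times p}$ such that the
vectorization $\mathrm{vec}(G) = g$. Clearly, we have $\|g\|_{b\infty} = \|G\|_{1,2}$, where $\|\cdot\|_{1,2}$
denotes the matrix norm as an operator from $(\mathbb{R}^p, \|\cdot\|_{\ell_1})$ to $(\mathbb{R}^n, \|\cdot\|_{\ell_2})$.
Recognizing $\|G\|_{1,2} = \max_{v \in S^{n-1}, w \in T^{p-1}} \langle Gw, v \rangle$, we define the Gaussian process
$X_{v,w} = \langle Gw, v \rangle$ indexed by $(v, w) \in S^{n-1} \times T^{p-1}$. Here $S^{n-1} = \{v \in \mathbb{R}^n :
\|v\|_2 = 1\}$ and $T^{p-1} = \{w \in \mathbb{R}^p : \|w\|_1 = 1\}$. We compare $X_{v,w}$ with
another Gaussian process $Y_{v,w} = \langle \xi, v \rangle + \langle \zeta, w \rangle$, $(v, w) \in S^{n-1} \times T^{p-1}$, where
$\xi \sim \mathcal{N}(0, \mathrm{I}_n)$ and $\zeta \sim \mathcal{N}(0, \mathrm{I}_p)$. The Gaussian processes $X_{v,w}$ and $Y_{v,w}$ satisfy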
the conditions for Slepian's inequality (See the proof of ^3| Theorem 32, page 
23]). Therefore, we have 



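$$\begin{aligned}
\mathbb{E}\|g\|_{b\infty} &= \mathbb{E}\max_{(v,w) \in S^{n-1} \times T^{p-1}} X_{v,w} \\
&\le \mathbb{E}\max_{(v,w) \in S^{n-1} \times T^{p-1}} Y_{v,w} \\
&= \mathbb{E}\max_{v \in S^{n-1}} \langle \xi, v \rangle + \mathbb{E}\max_{w \in T^{p-1}} \langle \zeta, w \rangle \\
&= \mathbb{E}\|\xi\|_2 + \mathbb{E}\|\zeta\|_\infty \\
&\le \sqrt{n} + \sqrt{\log p}.
\end{aligned} \qquad (80)$$

Here we have used

$$\mathbb{E}\|\xi\|_2 \le \sqrt{\mathbb{E}\|\xi\|_2^2} = \sqrt{n} \qquad (81)$$

due to Jensen's inequality, and

$$\mathbb{E}\|\zeta\|_\infty = \mathbb{E}\max_i \zeta_i \le \sqrt{\log p}, \qquad (82)$$

a fact given by [10, Equation 3.13, page 79].

The conclusion of Theorem 2 then follows from (79), (80), and a suitable
choice of $c_1$. □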
8.6 Proof of Proposition 2

Proof We rewrite the optimization (33) as

$$\frac{1}{s^*} = \max_z \|z\|_{b\infty} \quad \text{s.t.} \quad Qz = 0, \ \|z\|_{b1} \le 1. \qquad (83)$$

Note that in (83), we are maximizing a convex function over a convex set, which
is in general very difficult. We will use a relaxation technique to compute an
upper bound on the optimal value of (83). Define a matrix variable $P$ of the
same size as $Q$. Since the dual norm of $\|\cdot\|_{b\infty}$ is $\|\cdot\|_{b1}$, we have

$$\begin{aligned}
&\max_z \{\|z\|_{b\infty} : \|z\|_{b1} \le 1, Qz = 0\} \\
&= \max_{u,z} \{u^T z : \|z\|_{b1} \le 1, \|u\|_{b1} \le 1, Qz = 0\} \\
&= \max_{u,z} \{u^T (z - P^T Q z) : \|z\|_{b1} \le 1, \|u\|_{b1} \le 1, Qz = 0\} \\
&\le \max_{u,z} \{u^T (\mathrm{I}_{np} - P^T Q) z : \|z\|_{b1} \le 1, \|u\|_{b1} \le 1\}.
\end{aligned} \qquad (84)$$

In the last expression, we have dropped the constraint $Qz = 0$. Note that the
unit ball $\{z : \|z\|_{b1} \le 1\} \subset \mathbb{R}^{np}$ is the convex hull of $\{e_i^p \otimes v : 1 \le i \le p, v \in
\mathbb{R}^n, \|v\|_2 \le 1\}$ and $u^T(\mathrm{I}_{np} - P^T Q)z$ is convex (actually, linear) in $z$. As a
consequence, we have

$$\begin{aligned}
&\max_{u,z} \{u^T (\mathrm{I}_{np} - P^T Q) z : \|z\|_{b1} \le 1, \|u\|_{b1} \le 1\} \\
&= \max_{j,v,u} \{u^T (\mathrm{I}_{np} - P^T Q)(e_j^p \otimes v) : \|u\|_{b1} \le 1, \|v\|_2 \le 1\} \\
&= \max_j \max_{v,u} \{u^T (\mathrm{I}_{np} - P^T Q)_j v : \|u\|_{b1} \le 1, \|v\|_2 \le 1\} \\
&= \max_j \max_u \{\|(\mathrm{I}_{np} - P^T Q)_j^T u\|_2 : \|u\|_{b1} \le 1\},
\end{aligned} \qquad (85)$$

where $(\mathrm{I}_{np} - P^T Q)_j$ denotes the $j$th column block of $\mathrm{I}_{np} - P^T Q$, namely, the
submatrix of $\mathrm{I}_{np} - P^T Q$ formed by the $((j - 1)n + 1)$th to $jn$th columns.

Applying the same argument to the unit ball $\{u : \|u\|_{b1} \le 1\}$ and the
convex function $\|(\mathrm{I}_{np} - P^T Q)_j^T u\|_2$, we obtain

$$\max_{u,z} \{u^T (\mathrm{I}_{np} - P^T Q) z : \|z\|_{b1} \le 1, \|u\|_{b1} \le 1\} = \max_{i,j} \|(\mathrm{I}_{np} - P^T Q)_{i,j}\|_2. \qquad (86)$$

Here $(\mathrm{I}_{np} - P^T Q)_{i,j}$ is the submatrix of $\mathrm{I}_{np} - P^T Q$ formed by the $((i - 1)n + 1)$th
to $in$th rows and the $((j - 1)n + 1)$th to $jn$th columns, and $\|\cdot\|_2$ is the spectral
norm (the largest singular value).

Since $P$ is arbitrary, the tightest upper bound is obtained by minimizing
$\max_{i,j} \|(\mathrm{I}_{np} - P^T Q)_{i,j}\|_2$ with respect to $P$:

$$\begin{aligned}
1/s^* &= \max_z \{\|z\|_{b\infty} : \|z\|_{b1} \le 1, Qz = 0\} \\
&\le \min_P \max_{i,j} \|(\mathrm{I}_{np} - P^T Q)_{i,j}\|_2 \\
&= 1/s_*.
\end{aligned} \qquad (87)$$

Partition $P$ and $Q$ as $P = [P_1, \ldots, P_p]$ and $Q = [Q_1, \ldots, Q_p]$, with $P_i$ and
$Q_j$ having $n$ columns each. We explicitly write

$$(\mathrm{I}_{np} - P^T Q)_{i,j} = \delta_{ij}\mathrm{I}_n - P_i^T Q_j, \qquad (88)$$

where $\delta_{ij} = 1$ for $i = j$ and $0$ otherwise. As a consequence, we obtain

$$\begin{aligned}
&\min_P \max_{i,j} \|(\mathrm{I}_{np} - P^T Q)_{i,j}\|_2 \\
&= \min_{P_1, \ldots, P_p} \max_i \max_j \|\delta_{ij}\mathrm{I}_n - P_i^T Q_j\|_2 \\
&= \max_i \min_{P_i} \max_j \|\delta_{ij}\mathrm{I}_n - P_i^T Q_j\|_2.
\end{aligned} \qquad (89)$$

We have moved the $\max_i$ to the outmost position because for each $i$, $\max_j \|\delta_{ij}\mathrm{I}_n -
P_i^T Q_j\|_2$ is a function of only $P_i$ and does not depend on the other variables
$\{P_l\}_{l \ne i}$. □
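The relaxation in (84) to (86) holds for every $P$ of the same size as $Q$, which can be checked numerically. The sketch below assumes NumPy and consecutive column blocks of size $n$; the matrix sizes, the particular (non-optimized) choice $P = (Q^\dagger)^T$, and the helper names are assumptions made for illustration. It does not solve the semidefinite program that minimizes over $P$.

```python
import numpy as np

def block_norms(z, n):
    """l2 norms of the consecutive length-n blocks of z."""
    return np.linalg.norm(z.reshape(-1, n), axis=1)

def relaxation_bound(P, Q, n):
    """max over (i, j) of the spectral norm of the n x n block (I - P^T Q)_{i,j}."""
    B = np.eye(Q.shape[1]) - P.T @ Q
    p = Q.shape[1] // n
    return max(
        np.linalg.norm(B[i * n:(i + 1) * n, j * n:(j + 1) * n], 2)
        for i in range(p)
        for j in range(p)
    )

rng = np.random.default_rng(0)
m, n, p = 12, 2, 10
Q = rng.standard_normal((m, n * p))

# Illustrative, non-optimized choice of P with the same size as Q.
P = np.linalg.pinv(Q).T
bound = relaxation_bound(P, Q, n)

# Orthonormal basis of null(Q) from the SVD (Q has full row rank almost surely).
null_basis = np.linalg.svd(Q)[2][m:].T

worst = 0.0
for _ in range(200):
    z = null_basis @ rng.standard_normal(null_basis.shape[1])
    z /= block_norms(z, n).sum()  # normalize so that ||z||_b1 = 1
    worst = max(worst, block_norms(z, n).max())
```

Here `worst` is a sampled lower estimate of $1/s^*$ and `bound` is an upper bound on it; the gap between them shrinks when $P$ is optimized as in (87).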



8.7 Proof of Proposition 3

Proof 1. Since in the optimization problem defining $f_s(\eta)$, the objective
function $\|z\|_{b\infty}$ is continuous, and the constraint correspondence

$$C(\eta) : [0, \infty) \rightrightarrows \mathbb{R}^{np}, \qquad \eta \mapsto \{z : \|Qz\|_\diamond \le 1, \|z\|_{b1} \le s\eta\} \qquad (90)$$

is compact-valued and continuous (both upper and lower hemicontinuous),
according to Berge's Maximum Theorem [1], the optimal value function
$f_s(\eta)$ is continuous.
2. The monotone (non-strict) increasing property is obvious as increasing $\eta$
enlarges the region over which the maximization is taken. We now show
the strict increasing property. Suppose $0 < \eta_1 < \eta_2$, and $f_s(\eta_1)$ is achieved
by $z_1^* \ne 0$, namely, $f_s(\eta_1) = \|z_1^*\|_{b\infty}$, $\|Qz_1^*\|_\diamond \le 1$, and $\|z_1^*\|_{b1} \le s\eta_1$.
If $\|Qz_1^*\|_\diamond < 1$ (this implies $\|z_1^*\|_{b1} = s\eta_1$), we define $z_2 = cz_1^*$ with $c =
\min(1/\|Qz_1^*\|_\diamond, s\eta_2/\|z_1^*\|_{b1}) > 1$. We then have $\|Qz_2\|_\diamond \le 1$, $\|z_2\|_{b1} \le s\eta_2$,
and $f_s(\eta_2) \ge \|z_2\|_{b\infty} = c\|z_1^*\|_{b\infty} > f_s(\eta_1)$.

Consider the remaining case that $\|Qz_1^*\|_\diamond = 1$. Without loss of generality,
suppose $\|z_1^*\|_{b\infty} = \max_{1 \le j \le p} \|z_{1j}^*\|_2$ is achieved by the block $z_{11}^* \ne 0$.
Since $Q_1 z_{11}^*$ is linearly dependent with the columns of $\{Q_j, j = 2, \ldots, p\}$
($m$ is much less than $(p - 1)n$), there exist $\{\alpha_j \in \mathbb{R}^n\}_{j=2}^{p}$ such that
$Q_1 z_{11}^* + \sum_{j=2}^{p} Q_j \alpha_j = 0$. Define $\alpha = [z_{11}^{*T}, \alpha_2^T, \ldots, \alpha_p^T]^T \in \mathbb{R}^{np}$ satisfying
$Q\alpha = 0$ and $z_2 = z_1^* + c\alpha$ for $c > 0$ sufficiently small such that
$\|z_2\|_{b1} \le \|z_1^*\|_{b1} + c\|\alpha\|_{b1} \le s\eta_1 + c\|\alpha\|_{b1} \le s\eta_2$. Clearly, $\|Qz_2\|_\diamond =
\|Qz_1^* + cQ\alpha\|_\diamond = \|Qz_1^*\|_\diamond = 1$. As a consequence, we have $f_s(\eta_2) \ge
\|z_2\|_{b\infty} \ge \|z_{21}\|_2 = (1 + c)\|z_{11}^*\|_2 > \|z_1^*\|_{b\infty} = f_s(\eta_1)$.

The case for $\eta_1 = 0$ is proved by continuity.
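3. Next we show $f_s(\eta) \ge s\eta$ for sufficiently small $\eta > 0$. Take $z$ as the vector
whose first element is $s\eta$ and zero otherwise. We have $\|z\|_{b1} = s\eta$ and
$\|z\|_{b\infty} = s\eta > \eta$ (recall $s \in (1, s^*)$). In addition, when $\eta > 0$ is sufficiently
small, we also have $\|Qz\|_\diamond \le 1$. Therefore, for sufficiently small $\eta$, we have
$f_s(\eta) \ge s\eta > \eta$.

We next prove the existence of $\eta_B > 0$ and $\rho_B \in (0, 1)$ such that

$$f_s(\eta) < \rho_B \eta, \quad \forall \eta > \eta_B. \qquad (91)$$

We use contradiction to prove this statement. Suppose for all $\eta_B > 0$ and
$\rho_B \in (0, 1)$, there exists $\eta > \eta_B$ such that $f_s(\eta) \ge \rho_B \eta$. Construct sequences
$\{\eta^{(k)}\}_{k=1}^{\infty} \subset (0, \infty)$, $\{\rho^{(k)}\}_{k=1}^{\infty} \subset (0, 1)$, and $\{z^{(k)}\}_{k=1}^{\infty} \subset \mathbb{R}^{np}$ such that

$$\begin{aligned}
&\lim_{k \to \infty} \eta^{(k)} = \infty, \\
&\lim_{k \to \infty} \rho^{(k)} = 1, \\
&\rho^{(k)}\eta^{(k)} \le f_s(\eta^{(k)}) = \|z^{(k)}\|_{b\infty}, \\
&\|Qz^{(k)}\|_\diamond \le 1, \\
&\|z^{(k)}\|_{b1} \le s\eta^{(k)}.
\end{aligned} \qquad (92)$$

Decompose $z^{(k)} = z_1^{(k)} + z_2^{(k)}$, where $z_1^{(k)}$ is in the null space of $Q$ and $z_2^{(k)}$ is in
the orthogonal complement of the null space of $Q$. The sequence $\{z_2^{(k)}\}_{k=1}^{\infty}$
is bounded since $c\|z_2^{(k)}\| \le \|Qz^{(k)}\|_\diamond \le 1$, where $c = \inf_{z \perp \mathrm{null}(Q)} \|Qz\|_\diamond/\|z\| > 0$
and $\|\cdot\|$ is any norm. Then $\infty = \lim_{k \to \infty} \|z^{(k)}\|_{b\infty} \le \lim_{k \to \infty} (\|z_1^{(k)}\|_{b\infty} +
\|z_2^{(k)}\|_{b\infty})$ implies $\{z_1^{(k)}\}_{k=1}^{\infty}$ is unbounded. For sufficiently large $k$, we
proceed as follows:

$$\begin{aligned}
\frac{s}{(s/s^*)^{1/4}} &\ge \frac{s\eta^{(k)}}{\rho^{(k)}\eta^{(k)}} \ge \frac{\|z^{(k)}\|_{b1}}{\|z^{(k)}\|_{b\infty}} \\
&\ge \frac{\|z_1^{(k)}\|_{b1} - \|z_2^{(k)}\|_{b1}}{\|z_1^{(k)}\|_{b\infty} + \|z_2^{(k)}\|_{b\infty}} \ge \left(\frac{s}{s^*}\right)^{1/4} \frac{\|z_1^{(k)}\|_{b1}}{\|z_1^{(k)}\|_{b\infty}},
\end{aligned} \qquad (93)$$

where the inequalities involving $(s/s^*)^{1/4}$ hold only for sufficiently large $k$, and
the last inequality is due to the unboundedness of $\{z_1^{(k)}\}$ and the boundedness
of $\{z_2^{(k)}\}$. As a consequence, we have

$$\frac{\|z_1^{(k)}\|_{b1}}{\|z_1^{(k)}\|_{b\infty}} \le \frac{s}{(s/s^*)^{1/2}} < s^* \quad \text{with} \quad Qz_1^{(k)} = 0, \qquad (94)$$

which contradicts the definition of $s^*$.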
4. Next we show $f_s(\eta)$ has a unique positive fixed point $\eta^*$, which is equal to
$\gamma^* := 1/\omega_\diamond(Q, s)$. Properties 1) and 3) imply that there must be at least
one fixed point.

To show the uniqueness, we first prove $\gamma^* \ge \eta^*$ for any fixed point $\eta^* =
f_s(\eta^*)$. Suppose $z^*$ achieves the optimal value of the optimization defining
$f_s(\eta^*)$, i.e.,

$$\eta^* = f_s(\eta^*) = \|z^*\|_{b\infty}, \quad \|Qz^*\|_\diamond \le 1, \quad \|z^*\|_{b1} \le s\eta^*. \qquad (95)$$

Since $\|z^*\|_{b1}/\|z^*\|_{b\infty} \le s\eta^*/\eta^* = s$, we have

$$\gamma^* \ge \frac{\|z^*\|_{b\infty}}{\|Qz^*\|_\diamond} \ge \eta^*. \qquad (96)$$

If $\eta^* < \gamma^*$, we define $\eta_0 = (\eta^* + \gamma^*)/2$ and

$$z^c = \mathrm{argmax}_z \frac{s\|z\|_{b\infty}}{\|z\|_{b1}} \ \text{s.t.} \ \|Qz\|_\diamond \le 1, \ \|z\|_{b\infty} \ge \eta_0, \quad \text{and} \quad \rho = \frac{s\|z^c\|_{b\infty}}{\|z^c\|_{b1}}. \qquad (97)$$

Suppose $z^{**}$ with $\|Qz^{**}\|_\diamond = 1$ achieves the optimal value of the
optimization defining $\gamma^* = 1/\omega_\diamond(Q, s)$. Clearly, $\|z^{**}\|_{b\infty} = \gamma^* > \eta_0$, which
implies $z^{**}$ is a feasible point of the optimization defining $z^c$ and $\rho$. As a
consequence, we have

$$\rho \ge \frac{s\|z^{**}\|_{b\infty}}{\|z^{**}\|_{b1}} \ge 1. \qquad (98)$$

Actually we will show that $\rho > 1$. If $\|z^{**}\|_{b1} < s\|z^{**}\|_{b\infty}$, we are done.
If not (i.e., $\|z^{**}\|_{b1} = s\|z^{**}\|_{b\infty}$), as illustrated in Figure 1, we consider
$\xi = \frac{\eta_0}{\gamma^*} z^{**}$, which satisfies

$$\|Q\xi\|_\diamond \le \frac{\eta_0}{\gamma^*} < 1, \qquad (99)$$
$$\|\xi\|_{b\infty} = \eta_0, \quad \text{and} \qquad (100)$$
$$\|\xi\|_{b1} = s\eta_0. \qquad (101)$$

To get $\xi^n$ as shown in Figure 1, pick the block of $\xi$ with the smallest non-zero
$\ell_2$ norm, and scale that block by a small positive constant less than 1.
Because $s > 1$, $\xi$ has more than one non-zero block, implying $\|\xi^n\|_{b\infty}$ will
remain the same. If the scaling constant is close enough to 1, $\|Q\xi^n\|_\diamond$ will
remain less than 1. But the good news is that $\|\xi^n\|_{b1}$ decreases, and hence
$\rho \ge \frac{s\|\xi^n\|_{b\infty}}{\|\xi^n\|_{b1}}$ becomes greater than 1.

Now we proceed to obtain the contradiction that $f_s(\eta^*) > \eta^*$. If $\|z^c\|_{b1} \le
s \cdot \eta^*$, then it is a feasible point of

$$\max_z \|z\|_{b\infty} \quad \text{s.t.} \quad \|Qz\|_\diamond \le 1, \ \|z\|_{b1} \le s \cdot \eta^*. \qquad (102)$$

As a consequence, $f_s(\eta^*) \ge \|z^c\|_{b\infty} \ge \eta_0 > \eta^*$, contradicting that $\eta^*$ is a
fixed point, and we are done. If this is not the case, i.e., $\|z^c\|_{b1} > s \cdot \eta^*$,
we define a new point

$$z^n = \tau z^c \qquad (103)$$

with

$$\tau = \frac{s\eta^*}{\|z^c\|_{b1}} < 1. \qquad (104)$$

Note that $z^n$ is a feasible point of the optimization defining $f_s(\eta^*)$ since

$$\|Qz^n\|_\diamond = \tau\|Qz^c\|_\diamond < 1, \quad \text{and} \qquad (105)$$
$$\|z^n\|_{b1} = \tau\|z^c\|_{b1} = s \cdot \eta^*. \qquad (106)$$

Furthermore, we have

$$\|z^n\|_{b\infty} = \tau\|z^c\|_{b\infty} = \rho\eta^*. \qquad (107)$$

As a consequence, we obtain

$$f_s(\eta^*) \ge \rho\eta^* > \eta^*. \qquad (108)$$

Therefore, for any positive fixed point $\eta^*$, we have $\eta^* = \gamma^*$, i.e., the positive
fixed point is unique.
5. Property 5) is a consequence of 1), 3), and 4).
6. We demonstrate the existence of $\rho_2(\epsilon)$ only. The existence of $\rho_1(\epsilon)$ can be
proved in a similar manner, and hence is omitted.

We need to show that for fixed $\epsilon > 0$, there exists $\rho(\epsilon) < 1$ such that for
any $\eta \ge (1 + \epsilon)\eta^*$ we have

$$f_s(\eta) \le \rho(\epsilon)\eta. \qquad (109)$$

In view of (91), we need to prove the above statement only for $\eta \in [(1 +
\epsilon)\eta^*, \eta_B]$. We use contradiction. Suppose for any $\rho \in (0, 1)$ there exists
$\eta \in [(1 + \epsilon)\eta^*, \eta_B]$ such that $f_s(\eta) > \rho\eta$. Construct sequences $\{\eta^{(k)}\}_{k=1}^{\infty} \subset
[(1 + \epsilon)\eta^*, \eta_B]$ and $\{\rho^{(k)}\}_{k=1}^{\infty} \subset (0, 1)$ with

$$\lim_{k \to \infty} \rho^{(k)} = 1, \quad \text{and} \quad f_s(\eta^{(k)}) > \rho^{(k)}\eta^{(k)}. \qquad (110)$$

Due to the compactness of $[(1 + \epsilon)\eta^*, \eta_B]$, there must exist a subsequence
$\{\eta^{(k_l)}\}_{l=1}^{\infty}$ of $\{\eta^{(k)}\}_{k=1}^{\infty}$ such that $\lim_{l \to \infty} \eta^{(k_l)} = \eta_{\lim}$ for some $\eta_{\lim} \in [(1 +
\epsilon)\eta^*, \eta_B]$. As a consequence of the continuity of $f_s(\eta)$, we have

$$f_s(\eta_{\lim}) = \lim_{l \to \infty} f_s(\eta^{(k_l)}) \ge \lim_{l \to \infty} \rho^{(k_l)}\eta^{(k_l)} = \eta_{\lim}. \qquad (111)$$

Again due to the continuity of $f_s(\eta)$ and the fact that $f_s(\eta) < \eta$ for $\eta > \eta_B$,
there exists $\eta_c \in [\eta_{\lim}, \eta_B]$ such that

$$f_s(\eta_c) = \eta_c, \qquad (112)$$

contradicting the uniqueness of the fixed point for $f_s(\eta)$.
The result implies that starting from any initial point below the fixed point,
through the iteration $\eta_{t+1} = f_s(\eta_t)$ we can approach an arbitrarily small
neighborhood of the fixed point exponentially fast.

□
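The iteration $\eta_{t+1} = f_s(\eta_t)$ mentioned at the end of the proof can be sketched as below. Evaluating the true $f_s$ requires solving an optimization problem, so this sketch substitutes a hypothetical scalar map `toy_f` that merely shares the qualitative properties established above (continuous, strictly increasing, above the diagonal near zero, below it for large $\eta$). The function names, tolerance, and starting point are assumptions for illustration only.

```python
def fixed_point_iteration(f, eta0, tol=1e-10, max_iter=10_000):
    """Iterate eta <- f(eta) until successive iterates differ by at most tol."""
    eta = eta0
    for _ in range(max_iter):
        nxt = f(eta)
        if abs(nxt - eta) <= tol:
            return nxt
        eta = nxt
    raise RuntimeError("fixed point iteration did not converge")

def toy_f(eta):
    # Hypothetical stand-in for f_s: strictly increasing, toy_f(eta) > eta for
    # small eta > 0 and toy_f(eta) < eta for large eta.
    return min(2.0 * eta + 0.1, 0.5 * eta + 1.0)

# Start below the fixed point, as in the statement of the result.
eta_star = fixed_point_iteration(toy_f, eta0=0.1)
```

For this toy map the unique positive fixed point solves $\eta = 0.5\eta + 1$, and the error is halved at every step once the second branch is active, which illustrates the geometric convergence claimed above.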



8.8 Proof of Proposition 4

Proof Introducing an additional variable $v$ such that $Qz = v$, we have

$$\begin{aligned}
&\max_z \{\|z\|_{b\infty} : \|z\|_{b1} \le s\eta, \|Qz\|_\diamond \le 1\} \\
&= \max_{z,v} \{\|z\|_{b\infty} : \|z\|_{b1} \le s\eta, Qz = v, \|v\|_\diamond \le 1\} \\
&= \max_{z,v,u} \{u^T z : \|z\|_{b1} \le s\eta, Qz = v, \|v\|_\diamond \le 1, \|u\|_{b1} \le 1\} \\
&= \max_{z,v,u} \{u^T (z - P^T(Qz - v)) : \|z\|_{b1} \le s\eta, Qz = v, \|v\|_\diamond \le 1, \|u\|_{b1} \le 1\} \\
&\le \max_{z,v,u} \{u^T (\mathrm{I}_{np} - P^T Q) z + u^T P^T v : \|z\|_{b1} \le s\eta, \|v\|_\diamond \le 1, \|u\|_{b1} \le 1\},
\end{aligned} \qquad (113)$$

where for the last inequality we have dropped the constraint $Qz = v$. Similar
to (85), we bound from above the first term in the objective function of (113):

$$\begin{aligned}
&\max_{z,u} \{u^T (\mathrm{I}_{np} - P^T Q) z : \|z\|_{b1} \le s\eta, \|u\|_{b1} \le 1\} \\
&\le \max_{i,j} s\eta\|(\mathrm{I}_{np} - P^T Q)_{i,j}\|_2 \\
&= \max_{i,j} s\eta\|\delta_{ij}\mathrm{I}_n - P_i^T Q_j\|_2.
\end{aligned} \qquad (114)$$

Here $P_i$ (resp. $Q_j$) is the submatrix of $P$ (resp. $Q$) formed by the $((i - 1)n + 1)$th
to $in$th columns (resp. $((j - 1)n + 1)$th to $jn$th columns). For the second term
$u^T P^T v$ in the objective function of (113), the definition of $\|\cdot\|_\diamond^*$, the dual
norm of $\|\cdot\|_\diamond$, leads to

$$\max_{u,v} \{u^T P^T v : \|u\|_{b1} \le 1, \|v\|_\diamond \le 1\} \le \max_u \{\|Pu\|_\diamond^* : \|u\|_{b1} \le 1\}. \qquad (115)$$





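Since $\|Pu\|_\diamond^*$ is a convex function of $u$ and the extremal points of the unit
ball $\{u : \|u\|_{b1} \le 1\}$ are $\{e_i^p \otimes v : \|v\|_2 \le 1, v \in \mathbb{R}^n\}$, the above inequality is
further bounded from above as

$$\max_u \{\|Pu\|_\diamond^* : \|u\|_{b1} \le 1\} \le \max_i \max_v \{\|P_i v\|_\diamond^* : \|v\|_2 \le 1\} = \max_i \|P_i\|_{2,*}. \qquad (116)$$

Here $\|P_i\|_{2,*}$ is the operator norm of $P_i$ from the space with norm $\|\cdot\|_2$ to the
space with norm $\|\cdot\|_\diamond^*$. A tight upper bound on $\max_z \{\|z\|_{b\infty} : \|z\|_{b1} \le s\eta, \|Qz\|_\diamond \le 1\}$
is obtained by minimizing with respect to $P$:

$$\begin{aligned}
&\max_z \{\|z\|_{b\infty} : \|z\|_{b1} \le s\eta, \|Qz\|_\diamond \le 1\} \\
&\le \min_P \max_{i,j} s\eta\|\delta_{ij}\mathrm{I}_n - P_i^T Q_j\|_2 + \|P_i\|_{2,*} \\
&= \max_i \min_{P_i} \max_j s\eta\|\delta_{ij}\mathrm{I}_n - P_i^T Q_j\|_2 + \|P_i\|_{2,*}.
\end{aligned} \qquad (117)$$

When $Q = A$ and $\diamond = 2$, the operator norm

$$\|P_i\|_{2,*} = \|P_i\|_{2,2} = \|P_i\|_2. \qquad (118)$$

When $Q = A^T A$ and $\diamond = b\infty$, we have

$$\|P_i\|_{2,*} = \|P_i\|_{2,b1} = \max_{v : \|v\|_2 \le 1} \|P_i v\|_{b1} = \max_{v : \|v\|_2 \le 1} \sum_{l=1}^{p} \|P_i^l v\|_2. \qquad (119)$$

Here $P_i^l$ is the submatrix of $P$ formed by the $((i - 1)n + 1)$th to $in$th columns
and the $((l - 1)n + 1)$th to $ln$th rows. Proposition 4 then follows from (117),
(118), and (119). □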



8.9 Proof of Proposition 5

Proof 1. First note that adding an additional constraint $\|P_i\|_2 \le s\eta$ in the
definition of $g_{s,i}$ does not change the definition, because $g_{s,i}(\eta) \le s\eta$ as
easily seen by setting $P_i = 0$:

$$\begin{aligned}
g_{s,i}(\eta) &= \min_{P_i} \{s\eta(\max_j \|\delta_{ij}\mathrm{I}_n - P_i^T Q_j\|_2) + \|P_i\|_2 : \|P_i\|_2 \le s\eta\} \\
&= -\max_{P_i} \{-s\eta(\max_j \|\delta_{ij}\mathrm{I}_n - P_i^T Q_j\|_2) - \|P_i\|_2 : \|P_i\|_2 \le s\eta\}.
\end{aligned} \qquad (120)$$

Since the objective function to be maximized is continuous, and the
constraint correspondence

$$C(\eta) = \{P_i : \|P_i\|_2 \le s\eta\} \qquad (121)$$

is compact-valued and continuous (both upper and lower hemicontinuous),
according to Berge's Maximum Theorem [1], the optimal value function
$g_{s,i}(\eta)$ is continuous. The continuity of $g_s(\eta)$ follows from the fact that
finite maximization preserves the continuity.
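2. To show the strict increasing property, suppose $\eta_1 < \eta_2$ and $P_{i2}^*$ achieves
$g_{s,i}(\eta_2)$. Then we have

$$\begin{aligned}
g_{s,i}(\eta_1) &\le s\eta_1 \left(\max_j \|\delta_{ij}\mathrm{I}_n - P_{i2}^{*T} Q_j\|_2\right) + \|P_{i2}^*\|_2 \\
&< s\eta_2 \left(\max_j \|\delta_{ij}\mathrm{I}_n - P_{i2}^{*T} Q_j\|_2\right) + \|P_{i2}^*\|_2 \\
&= g_{s,i}(\eta_2).
\end{aligned} \qquad (122)$$

The strict increasing of $g_s(\eta)$ then follows immediately.

3. The concavity of $g_{s,i}(\eta)$ follows from the fact that $g_{s,i}(\eta)$ is the
minimization of a function of variables $\eta$ and $P_i$, and when $P_i$, the variable to be
minimized, is fixed, the function is linear in $\eta$.

4. Next we show that when $\eta > 0$ is sufficiently small, $g_s(\eta) \ge s\eta$. For any $i$,
we have the following:

$$\begin{aligned}
g_{s,i}(\eta) &= \min_{P_i} s\eta \left(\max_j \|\delta_{ij}\mathrm{I}_n - P_i^T Q_j\|_2\right) + \|P_i\|_2 \\
&\ge \min_{P_i} s\eta \left(1 - \|P_i^T Q_i\|_2\right) + \|P_i\|_2 \\
&\ge \min_{P_i} s\eta \left(1 - \|P_i\|_2\|Q_i\|_2\right) + \|P_i\|_2 \\
&= s\eta + \min_{P_i} \|P_i\|_2 \left(1 - s\eta\|Q_i\|_2\right) \\
&\ge s\eta > \eta,
\end{aligned} \qquad (123)$$

where the minimum of the last optimization is achieved at $P_i = 0$ when
$\eta < 1/(s\|Q_i\|_2)$. Clearly, $g_s(\eta) = \max_i g_{s,i}(\eta) \ge s\eta > \eta$ for such $\eta$.
Recall that

$$\frac{1}{s_*} = \max_i \min_{P_i} \max_j \|\delta_{ij}\mathrm{I}_n - P_i^T Q_j\|_2. \qquad (124)$$

Suppose $P_i^*$ is the optimal solution for each $\min_{P_i} \max_j \|\delta_{ij}\mathrm{I}_n - P_i^T Q_j\|_2$.
For each $i$, we then have

$$\frac{1}{s_*} \ge \max_j \|\delta_{ij}\mathrm{I}_n - P_i^{*T} Q_j\|_2, \qquad (125)$$

which implies

$$\begin{aligned}
g_{s,i}(\eta) &= \min_{P_i} s\eta \left(\max_j \|\delta_{ij}\mathrm{I}_n - P_i^T Q_j\|_2\right) + \|P_i\|_2 \\
&\le s\eta \left(\max_j \|\delta_{ij}\mathrm{I}_n - P_i^{*T} Q_j\|_2\right) + \|P_i^*\|_2 \\
&\le \frac{s}{s_*}\eta + \|P_i^*\|_2.
\end{aligned} \qquad (126)$$

As a consequence, we obtain

$$g_s(\eta) = \max_i g_{s,i}(\eta) \le \frac{s}{s_*}\eta + \max_i \|P_i^*\|_2. \qquad (127)$$

Pick $\rho \in (s/s_*, 1)$. Then, we have the following when $\eta > \max_i \|P_i^*\|_2/(\rho -
s/s_*)$:

$$g_s(\eta) \le \rho\eta. \qquad (128)$$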
5. We first show the existence and uniqueness of the positive fixed points for
$g_{s,i}(\eta)$. Properties 1) and 4) imply that $g_{s,i}(\eta)$ has at least one positive
fixed point. (Interestingly, 2) and 4) also imply the existence of a positive
fixed point, see [22].) To prove uniqueness, suppose there are two fixed
points $0 < \eta_1^* < \eta_2^*$. Pick $\eta_0$ small enough such that $g_{s,i}(\eta_0) > \eta_0 > 0$ and
$\eta_0 < \eta_1^*$. Then $\eta_1^* = \lambda\eta_0 + (1 - \lambda)\eta_2^*$ for some $\lambda \in (0, 1)$, which implies that
$g_{s,i}(\eta_1^*) \ge \lambda g_{s,i}(\eta_0) + (1 - \lambda)g_{s,i}(\eta_2^*) > \lambda\eta_0 + (1 - \lambda)\eta_2^* = \eta_1^*$ due to the
concavity, contradicting $\eta_1^* = g_{s,i}(\eta_1^*)$.

The set of positive fixed points for $g_s(\eta)$, $\{\eta \in (0, \infty) : \eta = g_s(\eta) =
\max_i g_{s,i}(\eta)\}$, is a subset of $\bigcup_{i=1}^{p} \{\eta \in (0, \infty) : \eta = g_{s,i}(\eta)\} = \{\eta_i^*\}_{i=1}^{p}$. We
argue that

$$\eta^* = \max_i \eta_i^* \qquad (129)$$

is the unique positive fixed point for $g_s(\eta)$.

We proceed to show that $\eta^*$ is a fixed point of $g_s(\eta)$. Suppose $\eta^*$ is a fixed
point of $g_{s,i_0}(\eta)$. Then it suffices to show that $g_s(\eta^*) = \max_i g_{s,i}(\eta^*) =
g_{s,i_0}(\eta^*)$. If this is not the case, there exists $i_1 \ne i_0$ such that $g_{s,i_1}(\eta^*) >
g_{s,i_0}(\eta^*) = \eta^*$. The continuity of $g_{s,i_1}(\eta)$ and the property 4) imply that
there exists $\eta > \eta^*$ with $g_{s,i_1}(\eta) = \eta$, contradicting the definition of $\eta^*$.
To show the uniqueness, suppose $\eta_1^*$ is a fixed point of $g_{s,i_1}(\eta)$ satisfying
$\eta_1^* < \eta^*$. Then, we must have $g_{s,i_0}(\eta_1^*) > g_{s,i_1}(\eta_1^*)$, because otherwise the
continuity implies the existence of another fixed point of $g_{s,i_0}(\eta)$. As a
consequence, $g_s(\eta_1^*) > g_{s,i_1}(\eta_1^*) = \eta_1^*$ and $\eta_1^*$ is not a fixed point of $g_s(\eta)$.

6. This property simply follows from the continuity, the uniqueness, and
property 4).
7. We use contradiction to show the existence of $\rho_1(\epsilon)$ in 7). In view of 4),
we need only to show the existence of such a $\rho_1(\epsilon)$ that works for $\eta_L \le
\eta \le (1 - \epsilon)\eta^*$ where $\eta_L = \sup\{\eta : g_s(\xi) \ge s\xi, \forall\, 0 < \xi \le \eta\}$. Supposing
otherwise, we then construct sequences $\{\eta^{(k)}\}_{k=1}^{\infty} \subset [\eta_L, (1 - \epsilon)\eta^*]$ and
$\{\rho_1^{(k)}\}_{k=1}^{\infty} \subset (1, \infty)$ with

$$\lim_{k \to \infty} \rho_1^{(k)} = 1, \quad \text{and} \quad g_s(\eta^{(k)}) \le \rho_1^{(k)}\eta^{(k)}. \qquad (130)$$

Due to the compactness of $[\eta_L, (1 - \epsilon)\eta^*]$, there must exist a subsequence
$\{\eta^{(k_l)}\}_{l=1}^{\infty}$ of $\{\eta^{(k)}\}$ such that $\lim_{l \to \infty} \eta^{(k_l)} = \eta_{\lim}$ for some $\eta_{\lim} \in [\eta_L, (1 -
\epsilon)\eta^*]$. As a consequence of the continuity of $g_s(\eta)$, we have

$$g_s(\eta_{\lim}) = \lim_{l \to \infty} g_s(\eta^{(k_l)}) \le \lim_{l \to \infty} \rho_1^{(k_l)}\eta^{(k_l)} = \eta_{\lim}. \qquad (131)$$

Again due to the continuity of $g_s(\eta)$ and the fact that $g_s(\eta) > \eta$ for $\eta < \eta_L$,
there exists $\eta_c \in [\eta_L, \eta_{\lim}]$ such that

$$g_s(\eta_c) = \eta_c, \qquad (132)$$

contradicting the uniqueness of the fixed point for $g_s(\eta)$. The existence of
$\rho_2(\epsilon)$ can be proved in a similar manner.

□
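Property 5, that the unique positive fixed point of $g_s = \max_i g_{s,i}$ equals $\max_i \eta_i^*$, can be illustrated on a toy example. In the sketch below each `toy_pieces[i]` is a hypothetical stand-in for $g_{s,i}$: a minimum of two affine functions of $\eta$, hence concave and strictly increasing, with the branch $1.5\eta$ playing the role of $s\eta$ (the value obtained at $P_i = 0$). None of the constants come from the paper, and no semidefinite program is solved here.

```python
def bisect_fixed_point(g, lo, hi, tol=1e-12):
    """Positive fixed point of g on [lo, hi], assuming g(lo) > lo and g(hi) < hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > mid:
            lo = mid   # still above the diagonal: the fixed point lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical concave, strictly increasing stand-ins for g_{s,i}.
toy_pieces = [
    lambda eta: min(1.5 * eta, 0.4 * eta + 0.9),
    lambda eta: min(1.5 * eta, 0.2 * eta + 1.6),
    lambda eta: min(1.5 * eta, 0.6 * eta + 0.5),
]

def g_s(eta):
    """Pointwise maximum of the toy pieces, mirroring g_s = max_i g_{s,i}."""
    return max(piece(eta) for piece in toy_pieces)

# Fixed point of each piece by bisection.
individual = [bisect_fixed_point(piece, 0.01, 100.0) for piece in toy_pieces]

# Fixed point of the maximum by naive fixed point iteration.
eta = 0.1
for _ in range(200):
    eta = g_s(eta)
```

In this toy setting the iterate `eta` agrees with `max(individual)`, which is the relation (129); combining bisection on individual pieces with fixed point iteration is the idea behind the third implementation discussed in Section 5.4.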